1 Introduction

One of the main tools in optimization is duality theory, which associates with a given (primal) problem (\(\mathcal {P}\)) another (dual) problem (\(\mathcal {D}\)), in such a way that the relation between the two problems provides useful information about (\(\mathcal {P}\)). In the case of convex optimization problems, under suitable assumptions, the optimal values of (\(\mathcal {P}\)) and (\(\mathcal {D}\)) coincide and the primal optimal solutions can be recovered from the dual optimal solutions; this is particularly useful when (\(\mathcal {D}\)) happens to be easier to solve than (\(\mathcal {P}\)). The essential tools of convex duality theory are convex conjugation and the notion of subgradient; we refer to the classical book by Rockafellar [1] for a detailed treatment of classical convex duality. In the last decades, there has also been very active research in nonconvex programming, motivated by the fact that many real-life optimization problems are nonconvex. In this paper, we propose a generalization of one of the most classical nonconvex duality theories, namely, Toland–Singer duality for d.c. (difference of convex) functions [2,3,4], consisting of a multiduality principle involving a (possibly infinite) collection of mutually dual problems. Our simple multiduality result does not require any convexity assumption, and one of its consequences is a new duality theorem involving just two mutually dual problems, expressed in terms of classical Fenchel conjugates. This theorem generalizes the well-known Toland–Singer duality theorem, and we use it to characterize (approximate) global optimal solutions, thus generalizing a well-known necessary and sufficient global optimality condition due to Hiriart-Urruty [5]. Another application of the new theorem yields a duality result involving the generalized perspective functions introduced by Maréchal in [6].

There are some numerical algorithms based on Toland–Singer duality, such as the DC algorithm [7] and branch-and-bound/cutting-plane type algorithms [8, 9]. Following the approach by Tao and El Bernoussi [7], we propose an algorithm for finding local optimal solutions of nonconvex problems having the format considered in our duality theorem.

The paper consists of three more sections. In Sect. 2, we state the new multiduality principle, which is naturally formulated in the framework of generalized conjugation theory. Section 3 considers the special case when the collection of mutually dual problems consists of just two problems; for simplicity, this section is presented in the setting of classical convex conjugation. In Sect. 4, we propose an extension of the DC algorithm to the broader class of nonconvex problems considered in Sect. 3.

2 Multiduality

Let \(\left( X_{i}\right) _{i\in I}\) be a family of nonempty sets, indexed by \(I\ne \emptyset \). We denote \(X:=\prod _{i\in I}X_{i}\) and \( X_{-i}:=\prod _{j\in I{\setminus } \{i\}}X_{j}\ \)for \(i\in I.\) For \(y\in X_{-i}\) and \(z\in X_{i},\) we define \(x_{(-i,y),(i,z)}\in X\) by

$$\begin{aligned} \left( x_{(-i,y),(i,z)}\right) _{i}:=z,\ \left( x_{(-i,y),(i,z)}\right) _{j}:=y_{j}\quad \text{ for }\quad j\in I{\setminus } \{i\}. \end{aligned}$$
(1)

Given a function \(c:X\rightarrow \overline{\mathbb {R}}\), for each \(i\in I\) we consider the coupling function \(c_{i}:X_{-i}\times X_{i}\rightarrow \overline{\mathbb {R}}\) defined by

$$\begin{aligned} c_{i}(y,z):=c\left( x_{(-i,y),(i,z)}\right) . \end{aligned}$$

Notice that, for \(x\in X,\) one has \(c_{i}(x_{-i},x_{i})=c(x)\); here and throughout we denote by \(x_{-i}\) the projection of \(x\in X\) onto \(X_{-i}\), that is,

$$\begin{aligned} (x_{-i})_j:=x_j\quad \text {for}\quad j\in I{\setminus } \{i\}. \end{aligned}$$

Following the generalized conjugation scheme of [10], we define a new coupling function \(c_{i}^{\prime }:X_{i}\times X_{-i}\rightarrow \overline{\mathbb {R}}\) by

$$\begin{aligned} c_{i}^{\prime }(z,y):=c_{i}(y,z) \end{aligned}$$

and consider the conjugation operators associated with \(c_{i}\) and \( c_{i}^{\prime },\) namely the \(c_{i}\)-conjugate of \(g:X_{-i}\rightarrow \mathbb {\overline{R}}\) is the function \(g^{c_{i}}:X_{i}\rightarrow \mathbb { \overline{R}}\) defined by

$$\begin{aligned} g^{c_{i}}(z):=\sup _{y\in X_{-i}}\{c_{i}(y,z)-g(y)\}, \end{aligned}$$

and, analogously, the \(c_{i}^{\prime }\)-conjugate of \(h:X_{i}\rightarrow \mathbb {\overline{R}}\) is \(h^{c_{i}^{\prime }}:X_{-i}\rightarrow \mathbb { \overline{R}}, \) defined by

$$\begin{aligned} h^{c_{i}^{\prime }}(y):=\sup _{z\in X_{i}}\{c_{i}^{\prime }(z,y)-h(z)\}; \end{aligned}$$

here and in the sequel we adopt the conventions

$$\begin{aligned} +\infty +\left( -\infty \right) =-\infty +\left( +\infty \right) =+\infty -\left( +\infty \right) =-\infty -\left( -\infty \right) :=-\infty . \end{aligned}$$
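On finite sets, the conjugation operators above can be experimented with by brute force. The following sketch (the helper names `ext_sub` and `c_conjugate` are ours, not part of the paper) computes a \(c_{i}\)-conjugate, using the stated convention that every ill-defined sum or difference of \(+\infty \) and \(-\infty \) equals \(-\infty \):

```python
import math

def ext_sub(a, b):
    """Extended-real difference a - b with the paper's convention:
    (+inf) - (+inf) = (-inf) - (-inf) := -inf."""
    if math.isinf(a) and math.isinf(b) and a == b:
        return -math.inf
    if math.isinf(b):
        return -math.inf if b > 0 else math.inf
    return a - b

def c_conjugate(g, c_i, X_i, X_minus_i):
    """g^{c_i}(z) = sup_{y in X_{-i}} { c_i(y, z) - g(y) }, on finite sets."""
    return {z: max(ext_sub(c_i(y, z), g(y)) for y in X_minus_i) for z in X_i}

# Example: conjugating the indicator of {-1, 1} under the coupling y*z
# gives g^{c_i}(z) = |z| on this grid.
delta = lambda y: 0.0 if y in (-1, 1) else math.inf
print(c_conjugate(delta, lambda y, z: y * z, [-2, 0, 2], [-1, 0, 1]))
```

The convention matters precisely when \(g(y)=+\infty \) (as for indicator functions): such points are then simply discarded from the supremum.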

For \(h:X_{i}\rightarrow \mathbb {\overline{R}},\) \(\overline{z}\in h^{-1}\left( {\mathbb {R}}\right) \) and \(\epsilon \ge 0,\) we set

$$\begin{aligned} \partial _{\epsilon }^{c_{i}^{\prime }}h\left( \overline{z}\right) :=\left\{ y\in X_{-i}\ :\ h\left( z\right) -h\left( \overline{z}\right) \ge c_{i}^{\prime }\left( z,y\right) -c_{i}^{\prime }\left( \overline{z} ,y\right) -\epsilon \quad \text {for every }z\in X_{i}\right\} . \end{aligned}$$

One can easily check that

$$\begin{aligned} \partial _{\epsilon }^{c_{i}^{\prime }}h\left( \overline{z}\right) =\left\{ y\in X_{-i}\ :\ -(h^{c_{i}^{\prime }}\left( y\right) -c_{i}^{\prime }\left( \overline{z},y\right) )\ge h\left( \overline{z}\right) -\epsilon \right\} . \end{aligned}$$

Given \(f:X\rightarrow \overline{\mathbb {R}}\), for \(i\in I\) and \({z}\in X_{i}\) we define \(f_{i,z}:X_{-i}\rightarrow \overline{\mathbb {R}}\) by

$$\begin{aligned} f_{i,z}(y):=f(x_{(-i,y),(i,z)}). \end{aligned}$$

Notice that, for \(x\in X,\) one has \(f_{i,x_{i}}(x_{-i})=f(x)\).

Theorem 2.1

Let \(f:X\rightarrow \overline{\mathbb {R}}.\) For every \( i\in I\) one has

$$\begin{aligned} \sup _{z\in X_{i}}\ f_{i,z}^{c_{i}}(z)=\sup _{x\in X}\{c(x)-f(x)\}. \end{aligned}$$

Hence, the optimal value of problem

$$\begin{aligned} (\mathcal {P}_{i}){\qquad }\sup _{z\in X_{i}}f_{i,z}^{c_{i}}(z) \end{aligned}$$

does not depend on \(i\).

Proof

We have

$$\begin{aligned} \sup _{z\in X_{i}}\ f_{i,z}^{c_{i}}(z)= & {} \sup _{z\in X_{i}}\sup _{y\in X_{-i}}\{c_{i}(y,z)-f_{i,z}(y)\} \\= & {} \sup _{x\in X}\{c_{i}(x_{-i},x_{i})-f_{i,x_{i}}(x_{-i})\} \\= & {} \sup _{x\in X}\{c(x)-f(x)\}. \end{aligned}$$

\(\square \)
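When the index set and all factors are finite, Theorem 2.1 can be verified exhaustively. In the sketch below (the three factors and the data c, f are arbitrary choices of ours), points of X are encoded as dictionaries \(i\mapsto x_{i}\), and the left-hand side is computed for every \(i\in I\):

```python
import itertools

# Three finite factors standing in for the X_i (our choice of data).
factors = {1: [0, 1], 2: [-1, 0, 1], 3: [0, 2]}
I = sorted(factors)

c = lambda x: x[1] * x[2] + x[3]              # coupling c : X -> R
f = lambda x: (x[1] - x[2]) ** 2 - x[3] ** 3  # any function f : X -> R

def points(idx):
    """All points of prod_{i in idx} X_i, encoded as dicts i -> x_i."""
    return [dict(zip(idx, v))
            for v in itertools.product(*(factors[i] for i in idx))]

# Right-hand side of Theorem 2.1: sup_{x in X} { c(x) - f(x) }.
rhs = max(c(x) - f(x) for x in points(I))

for i in I:
    others = [j for j in I if j != i]
    # sup_{z in X_i} f_{i,z}^{c_i}(z) = sup_z sup_y { c_i(y,z) - f_{i,z}(y) }.
    lhs = max(c({**y, i: z}) - f({**y, i: z})
              for z in factors[i] for y in points(others))
    assert lhs == rhs  # the optimal value of (P_i) does not depend on i
```

The assertions simply retrace the proof: grouping the supremum over X coordinate-wise cannot change its value.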

Corollary 2.1

Let \(G\subseteq X\) and, for \(i\in I\), denote by \( C_{i}:X_{i}\rightrightarrows X_{-i}\) the set-valued mapping defined by

$$\begin{aligned} {C_{i}(z):=\{y\in X_{-i}\ :\ x_{(-i,y),(i,z)}\in G\}.} \end{aligned}$$

Then

$$\begin{aligned} \sup _{z\in X_{i}}\delta _{C_{i}(z)}^{c_{i}}(z)=\sup _{x\in G}c(x). \end{aligned}$$

Proof

Apply Theorem 2.1 with \(f:=\delta _{G}\). \(\square \)

Corollary 2.2

Let \(G_{i}\subseteq X_{i}\) for every \(i\in I,\) and set \(G:=\prod _{i\in I}G_{i}\) and \(G_{-i}:=\prod _{j\in I{\setminus } \{i\}}G_{j}.\) Then

$$\begin{aligned} \sup _{z\in G_{i}}\delta _{G_{-i}}^{c_{i}}(z)=\sup _{x\in G}c(x). \end{aligned}$$

Proof

Apply Corollary 2.1 with \(G:=\prod _{i\in I}G_{i}.\) \(\square \)

Corollary 2.3

Let \(I:=\{1,\ldots ,n\},\) let \(f_{i}:X_{i}\rightarrow \mathbb {R}\cup \{+\infty \}\) \((i=1,\ldots ,n)\), and define

$$\begin{aligned} f:\displaystyle \prod \limits _{i=1}^{n}X_{i} \rightarrow \mathbb {R}\cup \{+\infty \} \quad \text {by} \quad f\left( x_{1},\ldots ,x_{n}\right) :=\displaystyle \sum \limits _{i=1}^{n}f_{i} \left( x_{i}\right) . \end{aligned}$$

Then, for every \(i\in I,\) one has

$$\begin{aligned} \sup _{z\in X_{i}}\left\{ \left( \displaystyle \sum \limits _{\begin{array}{c} j=1 \\ j\ne i \end{array}}^{n}f_{j}\right) ^{c_{i}}(z)-f_{i}\left( z\right) \right\} =\sup _{x\in X}\{c(x)-f(x)\}. \end{aligned}$$

Theorem 2.2

Assume that the optimal value of \((\mathcal {P}_{i})\) is finite, and let \(\epsilon ,\eta \ge 0.\) If \(\bar{z}\) is an \(\epsilon \)-optimal solution to \((\mathcal {P}_{i})\) such that \(f_{i,\bar{z}}^{c_{i}c_{i}^{\prime }}=f_{i,\bar{z}}\), then, for every \(\bar{y}\in \partial _{\eta }^{c_{i}^{\prime }}f_{i,\bar{z}}^{c_{i}}(\bar{z})\) and \(j\in I{\setminus } \{i\},\) the point \(\bar{y}_{j}\) is an \((\epsilon +\eta )\)-optimal solution to \( \left( \mathcal {P}_{j}\right) \).

Proof

We have

$$\begin{aligned} f_{j,\bar{y}_{j}}^{c_{j}}(\bar{y}_{j})\ge & {} c_{j}\left( x_{(i,\bar{z} ),(I{\setminus } \{i,j\},\bar{y}_{-j})},\bar{y}_{j}\right) -f_{j,\bar{y} _{j}}(x_{(i,\bar{z}),(I{\setminus } \{i,j\},\bar{y}_{-j})}) \\= & {} c(x_{(-i,\bar{y}),(i,\bar{z})})-f(x_{(-i,\bar{y}),(i,\bar{z})}) \\= & {} c_{i}(\overline{y},\overline{z})-f_{i,\bar{z}}(\bar{y})=c_{i}^{\prime }( \overline{z},\overline{y})-f_{i,\bar{z}}(\bar{y}) \\= & {} c_{i}^{\prime }(\overline{z},\overline{y})-f_{i,\bar{z} }^{c_{i}c_{i}^{\prime }}(\bar{y})\ge f_{i,\bar{z}}^{c_{i}}(\bar{z})-\eta \\\ge & {} \sup _{z\in X_{i}}f_{i,z}^{c_{i}}(z)-\epsilon -\eta =\sup _{v\in X_{j}}f_{j,v}^{c_{j}}(v)-(\epsilon +\eta ), \end{aligned}$$

the last equality being an immediate consequence of Theorem 2.1. \(\square \)

3 Nonconvex Duality

3.1 A Generalization of Toland–Singer Duality

Let \((X_{1},X_{2},\langle \cdot ,\cdot \rangle )\) be a dual pair of locally convex spaces and \(f:X_{1}\times X_{2}\rightarrow \mathbb {\overline{R}}\). In this section, we will apply the general results of Sect. 2 to the special case when \(I:=\left\{ 1,2\right\} \) and c is the duality pairing \(\langle \cdot ,\cdot \rangle \).

We recall that the Fenchel conjugates of \(g:X_{2}\rightarrow \mathbb {\overline{R}} \) and \(h:X_{1}\rightarrow \mathbb {\overline{R}}\) are the functions \(g^{*}:X_{1}\rightarrow \mathbb {\overline{R}}\) and \(h^{*}:X_{2}\rightarrow \mathbb {\overline{R}},\) respectively, defined by

$$\begin{aligned} g^{*}\left( z\right) :=\sup _{y\in X_{2}}\left\{ \left\langle z,y\right\rangle -g\left( y\right) \right\} ,{\qquad }h^{*}\left( y\right) :=\sup _{z\in X_{1}}\left\{ \left\langle z,y\right\rangle -h\left( z\right) \right\} . \end{aligned}$$

It is easy to see that, for the coupling function c considered in this section, one has \(g^{c_{1}}=g^{*}\) and \(h^{c_{2}}=h^{*}.\) Thus, Theorem 2.1 and Corollaries 2.1 and 2.2 yield:

Corollary 3.1

Let \(f:X_{1}\times X_{2}\rightarrow \mathbb {\overline{R}}.\) Then

$$\begin{aligned} \sup _{z\in X_{1}}\ f(z,\cdot )^{*}(z)=\sup _{\left( z,y\right) \in X_{1}\times X_{2}}\left\{ \left\langle z,y\right\rangle -f\left( z,y\right) \right\} =\sup _{y\in X_{2}}\ f(\cdot ,y)^{*}(y). \end{aligned}$$

Corollary 3.2

Let \(T:X_1\rightrightarrows X_2.\) Then

$$\begin{aligned} \sup _{z\in X_{1}}\delta _{T(z)}^{*}(z)=\sup _{\left( {z,y}\right) {\in } Graph\ T}\left\langle z,y\right\rangle =\sup _{y\in X_{2}}\delta _{T^{-1}(y)}^{*}(y). \end{aligned}$$

Proof

Apply Corollary 2.1 with \(G:=Graph\ T.\) \(\square \)

Corollary 3.3

Let \(Z\subseteq X_{1}\) and \(Y\subseteq X_{2}.\) Then

$$\begin{aligned} \sup _{z\in Z}\delta _{Y}^{*}(z)=\sup _{\left( z,y\right) \in Z\times Y}\left\langle z,y\right\rangle =\sup _{y\in Y}\delta _{Z}^{*}(y). \end{aligned}$$

Similarly, from Corollary 2.3 or, alternatively, Corollary 3.1 (by setting \(f\left( z,y\right) :=h\left( z\right) +g\left( y\right) \)), one obtains the following version of the classical Toland–Singer duality theorem [2, 4]:

Corollary 3.4

Let \(g:X_{2}\rightarrow \mathbb {\overline{R}}\) and \(h:X_{1}\rightarrow \mathbb {\overline{R}}\). Then one has

$$\begin{aligned} \sup _{z\in X_{1}}\left\{ g^{*}(z)-h\left( z\right) \right\} =\sup _{\left( z,y\right) \in X_{1}\times X_{2}}\left\{ \left\langle z,y\right\rangle -h\left( z\right) -g\left( y\right) \right\} =\sup _{y\in X_{2}}\left\{ h^{*}(y)-g\left( y\right) \right\} . \end{aligned}$$

From the preceding corollary, by taking \(g:=k^{*}\), with \(k:X_{1}\rightarrow \mathbb {\overline{R}}\) such that \(k^{**}=k\), one immediately obtains the standard Toland–Singer formula.
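Corollary 3.4 lends itself to a quick numerical illustration. The sketch below replaces \(X_{1}=X_{2}=\mathbb {R}\) by a uniform grid (the grid and the particular g and h chosen here are ours) and checks that the primal, dual and joint suprema coincide on the grid:

```python
import math

Z = [-3.0 + 0.03 * k for k in range(201)]    # grid standing in for X_1
Y = list(Z)                                  # grid standing in for X_2

h = lambda z: z ** 4 / 4.0                   # any function on X_1
g = lambda y: math.cosh(y)                   # any function on X_2

g_star = {z: max(z * y - g(y) for y in Y) for z in Z}   # g*(z)
h_star = {y: max(z * y - h(z) for z in Z) for y in Y}   # h*(y)

primal = max(g_star[z] - h(z) for z in Z)    # sup_z { g*(z) - h(z) }
dual   = max(h_star[y] - g(y) for y in Y)    # sup_y { h*(y) - g(y) }
joint  = max(z * y - h(z) - g(y) for z in Z for y in Y)

assert abs(primal - joint) < 1e-9 and abs(dual - joint) < 1e-9
```

On a finite grid the three maxima range over the same finite family of numbers, so the equalities hold exactly, in line with the fact that Corollary 3.4 needs no convexity or topological assumptions.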

We recall that, for \(\eta \ge 0,\) the \(\eta \)-subdifferential of \( h:X_{1}\rightarrow \mathbb {\overline{R}}\) at \(\bar{z}\in h^{-1}\left( \mathbb {R}\right) \) is the set

$$\begin{aligned} \partial _{\eta }h\left( \bar{z}\right) :=\left\{ y\in X_{2}{\ :\ h}\left( z\right) \ge h\left( \bar{z}\right) +\left\langle z-\bar{z},y\right\rangle -\eta \quad \text {for every }z\in X_{1}\right\} . \end{aligned}$$

From Theorem 2.2, one gets the following result on approximate optimal solutions of the pair of dual problems

$$\begin{aligned} \left( \mathcal {P}\right) \qquad \qquad \qquad \qquad \qquad \text {maximize } f(z,\cdot )^{*}(z) \end{aligned}$$

and

$$\begin{aligned} \left( \mathcal {D}\right) \qquad \qquad \qquad \qquad \qquad \text {maximize }f(\cdot ,y)^{*}(y) \end{aligned}$$

of Corollary 3.1. A related result, showing how to obtain optimal dual solutions from optimal primal solutions of nonconvex problems, can be found in [3, Theorem 2.4].

Corollary 3.5

Assume that the optimal value of \((\mathcal {P})\) is finite, and let \(\epsilon ,\eta \ge 0.\) If \(\bar{z}\) is an \(\epsilon \)-optimal solution to \((\mathcal {P})\) such that \(f\left( \overline{z},\cdot \right) \) is convex and l.s.c., then, for every \(\bar{y}\in \partial _{\eta }f\left( \overline{z},\cdot \right) ^{*}(\bar{z}),\) the point \(\bar{y}\) is an \((\epsilon +\eta )\)-optimal solution to \(\left( \mathcal {D}\right) \).

The following characterization of approximate global optimal solutions generalizes a well-known necessary and sufficient global optimality condition due to Hiriart-Urruty [5].

Theorem 3.1

Assume that the optimal value of \((\mathcal {P})\) is finite, and let \(\bar{z}\in X_{1}\) and \(\epsilon \ge 0.\) Then \(\bar{z}\) is an \(\epsilon \)-optimal solution to \((\mathcal {P})\) if and only if for every \( \eta \ge 0\) and every \(y\in f(\bar{z},\cdot )^{-1}\left( \mathbb {R}\right) \) such that \(\bar{z}\in \partial _{\eta }f(\bar{z},\cdot )({y})\) one has

$$\begin{aligned} y\in \partial _{\epsilon +\eta }f(\cdot ,{y})(\bar{z}). \end{aligned}$$
(2)

Proof

Let \(\bar{z}\) be an \(\epsilon \)-optimal solution to \((\mathcal {P}),\) and \( \eta \ge 0\) and \(y\in f(\bar{z},\cdot )^{-1}\left( \mathbb {R}\right) \) be such that \(\bar{z}\in \partial _{\eta }f(\bar{z},\cdot )({y}).\) Using Corollary 3.1, we obtain

$$\begin{aligned} f\left( \cdot ,y\right) (\bar{z})= & {} f(\bar{z},\cdot )\left( y\right) \le \left\langle \bar{z},y\right\rangle +\eta -f(\bar{z},\cdot )^{*}(\bar{z}) \\\le & {} \left\langle \bar{z},y\right\rangle +\eta -\sup _{z\in X_{1}}\ f(z,\cdot )^{*}(z)+\epsilon \le \left\langle \bar{z},y\right\rangle +\eta -f(\cdot ,y)^{*}({y})+\epsilon , \end{aligned}$$

which proves (2).

Conversely, assume that the condition stated after “if and only if” holds, let \(y\in f(\bar{z},\cdot )^{-1}\left( \mathbb {R}\right) ,\) and define \(\eta :=f(\bar{z},y)+f(\bar{z},\cdot )^{*}(\bar{z})-\left\langle \bar{z},y\right\rangle \). Then \(\eta \ge 0\) and \( \bar{z}\in \partial _{\eta }f(\bar{z},\cdot )({y}),\) and therefore (2) holds. Hence

$$\begin{aligned} f(\bar{z},\cdot )^{*}(\bar{z})=\left\langle \bar{z},y\right\rangle +\eta -f(\bar{z},y)\ge f(\cdot ,y)^{*}({y})-\epsilon , \end{aligned}$$

which shows that \(\bar{z}\) is an \(\epsilon \)-optimal solution to \((\mathcal {P}).\) \(\square \)

To apply the duality theory presented in this section to a given optimization problem, one has to be able to recognize whether the objective function of the problem under consideration can be written in the form \( f\left( z,\cdot \right) ^{*}\left( z\right) \) for some function \( f:X_{1}\times X_{2}\rightarrow \mathbb {\overline{R}}\). The following theorem provides a useful criterion for this purpose.

Theorem 3.2

For every \(f:X_{1}\times X_{2}\rightarrow \mathbb {\overline{R}},\) the function \(\varphi :X_{1}^{2}\rightarrow \overline{\mathbb {R}}\) defined by \( \varphi (z,z^{\prime }):=f\left( z,\cdot \right) ^{*}\left( z^{\prime }\right) \) is convex, proper (the constant functions \( +\infty \) and \(-\infty \) being accepted as proper here) and l.s.c. in its second argument. Conversely, for every function \(\varphi :X_{1}^{2}\rightarrow \overline{\mathbb {R}}\) with these properties there exists \(f:X_{1}\times X_{2}\rightarrow \mathbb { \overline{R}}\) such that \(f\left( z,\cdot \right) ^{*}\left( z\right) =\varphi (z,z)\) for every \(z\in X_{1}\).

Proof

The first part of the statement is an immediate consequence of well-known properties of conjugate functions. To prove the converse, define \( f:X_{1}\times X_{2}\rightarrow \overline{\mathbb {R}}\) by \(f\left( z,y\right) :=\varphi \left( z,\cdot \right) ^{*}\left( y\right) ;\) then, since \( \varphi \left( z,\cdot \right) \) is convex, proper and l.s.c., from the equality \(f\left( z,\cdot \right) =\varphi \left( z,\cdot \right) ^{*}\) it follows that \(\varphi \left( z,\cdot \right) =f\left( z,\cdot \right) ^{*},\) and therefore \(\varphi \left( z,z\right) =f\left( z,\cdot \right) ^{*}\left( z\right) \). \(\square \)
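The construction in this proof is easy to test numerically. Below we pick a \(\varphi \) that is convex, proper and l.s.c. in its second argument (the specific \(\varphi \), the grids and the tolerance are our choices), build \(f(z,y):=\varphi (z,\cdot )^{*}(y)\) by brute force, and check that \(f(z,\cdot )^{*}(z)\) recovers the diagonal values \(\varphi (z,z)\):

```python
import math

# phi(z, z') is convex, proper and l.s.c. in z' (quadratic) and
# arbitrary in z, as required by Theorem 3.2.
phi = lambda z, zp: 0.5 * (zp - z) ** 2 + math.sin(z)

ZP = [-4.0 + 0.1 * k for k in range(81)]  # grid for the conjugation variable
Zs = [-1.0 + 0.1 * k for k in range(21)]  # test points z
Ys = [-1.0 + 0.1 * k for k in range(21)]  # grid standing in for X_2

def f(z, y):
    """f(z, y) := phi(z, .)*(y), computed by brute force on the grid ZP."""
    return max(zp * y - phi(z, zp) for zp in ZP)

for z in Zs:
    # f(z, .)*(z) should recover the diagonal value phi(z, z) = sin(z).
    value = max(z * y - f(z, y) for y in Ys)
    assert abs(value - phi(z, z)) < 1e-6
```

For this \(\varphi \) one can also check by hand that \(f(z,y)=zy+y^{2}/2-\sin z\), so the inner maximum is attained at \(y=0\) and indeed returns \(\sin z=\varphi (z,z)\).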

According to the preceding theorem, the class of problems

$$\begin{aligned} \left( \mathcal {P}\right) \qquad \qquad \qquad \qquad \text {maximize }f\left( z,\cdot \right) ^{*}\left( z\right) \end{aligned}$$

to which our duality theory applies coincides with that consisting of problems with format

$$\begin{aligned} \left( \mathcal {P}\right) \qquad \qquad \qquad \qquad \qquad \qquad \text {maximize }\varphi (z,z), \qquad \qquad \qquad \qquad \qquad \end{aligned}$$
(3)

the function \(\varphi :X_{1}^{2}\rightarrow \overline{\mathbb {R}}\) being as in the statement. Since, in view of the proof, the objective functions of these two problems are linked by the relation \(f\left( z,\cdot \right) :=\varphi \left( z,\cdot \right) ^{*},\) a straightforward computation yields the dual objective function in terms of \(\varphi :\)

$$\begin{aligned} f\left( \cdot ,y\right) ^{*}\left( y\right) =\sup _{z\in X_{1}}\left\{ \left\langle z,y\right\rangle -f\left( z,y\right) \right\} =\sup _{z\in X_{1}}\left\{ \left\langle z,y\right\rangle -\varphi \left( z,\cdot \right) ^{*}\left( y\right) \right\} . \end{aligned}$$
(4)

Hence, if the primal problem is stated as (3), the formulation of the dual problem is

$$\begin{aligned} \left( \mathcal {D}\right) \qquad \qquad \qquad \qquad {\qquad }\sup _{z\in X_{1}}\left\{ \left\langle z,y\right\rangle -\varphi \left( z,\cdot \right) ^{*}\left( y\right) \right\} . \end{aligned}$$

To illustrate that solving the dual problem may be advantageous, we present the following academic example, in which the primal problem consists in maximizing a nonconcave function, whereas the dual objective function is concave.

Example 3.1

Let us consider the primal problem

$$\begin{aligned} \left( \mathcal {P}\right) \qquad \qquad \qquad \qquad \qquad \text {maximize }\alpha (z), \end{aligned}$$

where \(\alpha \) is a nonconcave function defined on a normed space \(X_{1}\). Let us define \(\varphi :X_{1}^{2}\rightarrow \overline{\mathbb {R}}\) by

$$\begin{aligned} \varphi (z,z^{\prime }):=\alpha (z)-||z||^{2}+\left\langle z^{\prime },z\right\rangle +\delta _{NB}(z^{\prime }-z); \end{aligned}$$

here B is the closed unit ball in \(X_{1}\) and \(N>0.\) Clearly, \(\varphi (z,z)=\alpha (z),\) and therefore \(\left( \mathcal {P}\right) \) can be rewritten as (3). Since

$$\begin{aligned} \varphi \left( z,\cdot \right) ^{*}\left( y\right)= & {} \sup _{z^{\prime }\in X_{1}}\left\{ \left\langle z^{\prime },y\right\rangle -\varphi \left( z,z^{\prime }\right) \right\} \\= & {} \sup _{z^{\prime }\in X_{1}}\left\{ \left\langle z^{\prime },y-z\right\rangle -\delta _{NB}(z^{\prime }-z)\right\} -\alpha (z)+||z||^{2} \\= & {} \sup _{v\in X_{1}}\left\{ \left\langle z+v,y-z\right\rangle -\delta _{NB}(v)\right\} -\alpha (z)+||z||^{2} \\= & {} \sup _{v\in NB}\left\langle v,y-z\right\rangle +\left\langle z,y\right\rangle -\alpha (z) \\= & {} N||y-z||+\left\langle z,y\right\rangle -\alpha (z), \end{aligned}$$

according to (4) we get the following expression for the dual objective function:

$$\begin{aligned} f\left( \cdot ,y\right) ^{*}\left( y\right) =\sup _{z\in X_{1}}\left\{ \left\langle z,y\right\rangle -\varphi \left( z,\cdot \right) ^{*}\left( y\right) \right\} =\sup _{z\in X_{1}}\left\{ \alpha (z)-N||y-z||\right\} , \end{aligned}$$

which, in view of [11, p. 200], shows that it is the smallest N-Lipschitz majorant of \(\alpha \). This N-Lipschitz envelope is concave in some cases; for instance, let us consider the case when \(X_{1}:=\mathbb {R},\)

$$\begin{aligned} \alpha (z):=-2z^{6}+15z^{4}-36z^{2} \end{aligned}$$

and \(N:=10\). Then one can easily prove that

$$\begin{aligned} f\left( \cdot ,y\right) ^{*}\left( y\right) =\left\{ \begin{array}{ll} 10y+10r+\alpha (-r), &{}\quad \text{ if }\quad y\le -r, \\ \alpha (y),\quad \quad \quad \quad &{} \quad \text{ if }\quad y\in [-r,r],\\ -10y+10r+\alpha (r), &{}\quad \text{ if }\quad y\ge r, \end{array} \right. \end{aligned}$$

with r being the smallest positive real number satisfying \(\alpha ^{\prime }(-r)=10\).

Figure 1 depicts the graphs of both \(\alpha \) and \( y\mapsto f\left( \cdot ,y\right) ^{*}\left( y\right) \) and shows that the latter function is concave.

Fig. 1 Graph of \(\alpha \) (black line) and \(y\mapsto f\left( \cdot ,y\right) ^{*}\left( y\right) \) (dashed line)
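The closed form of this envelope can be cross-checked numerically. The sketch below locates r by bisection and compares the brute-force envelope \(\sup _{z}\{\alpha (z)-10|y-z|\}\) with the piecewise expression (the grid, the bisection bracket and the tolerances are ours):

```python
alpha = lambda z: -2 * z**6 + 15 * z**4 - 36 * z**2
N = 10.0

# Smallest positive r with alpha'(-r) = 10, i.e. 12 r (r^2-2)(r^2-3) = 10;
# near 0 the left-hand side increases from 0, so bisection on [0, 1] works.
lo, hi = 0.0, 1.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if 12 * mid * (mid**2 - 2) * (mid**2 - 3) < 10:
        lo = mid
    else:
        hi = mid
r = 0.5 * (lo + hi)

Zgrid = [-3.0 + 0.002 * k for k in range(3001)]

def envelope(y):
    """Smallest N-Lipschitz majorant of alpha: sup_z { alpha(z) - N|y-z| }."""
    return max(alpha(z) - N * abs(y - z) for z in Zgrid)

def piecewise(y):
    """Linear with slope +/-N outside [-r, r], equal to alpha inside."""
    if y <= -r:
        return N * y + N * r + alpha(-r)
    if y <= r:
        return alpha(y)
    return -N * y + N * r + alpha(r)

for y in [-2.0, -0.5, 0.0, 0.05, 0.5, 2.0]:
    assert abs(envelope(y) - piecewise(y)) < 1e-2
```

Continuity at \(y=\pm r\) forces the constant terms \(Nr+\alpha (\pm r)\) in the linear pieces, and the matching slopes \(\alpha ^{\prime }(\pm r)=\mp 10\) are what make the resulting function concave.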

3.2 Example: Generalized Perspective Functions

Let \(g:X_{2}\rightarrow \left] 0,+\infty \right[ \) and \(h:X_{1}\rightarrow \left] 0,+\infty \right[ ,\) and define \(f:X_{1}\times X_{2}\rightarrow \left] 0,+\infty \right[ \) by \(f(z,y):=h(z)g(y)\). For this particular function f, the objective functions of problems \((\mathcal {P})\) and \((\mathcal {D})\) can be expressed by means of the operation \(\Delta \) introduced in [6], which constitutes a generalization of the so-called perspective function of Convex Analysis. Indeed, for \(z\in X_{1}\) one has \(f(z,\cdot )=h(z)g,\) and hence

$$\begin{aligned} f(z,\cdot )^{*}(z)=\left( h(z)g\right) ^{*}(z)=h(z)g^{*}\left( \frac{z}{h(z)}\right) =\left( g^{*}\Delta h\right) \left( z,z\right) . \end{aligned}$$

Similarly, for \(y\in X_{2}\) one has

$$\begin{aligned} f(\cdot ,y)^{*}\left( y\right) =\left( g(y)h\right) ^{*}\left( y\right) =g(y)h^{*}\left( \frac{y}{g(y)}\right) =\left( h^{*}\Delta g\right) \left( y,y\right) . \end{aligned}$$

Thus, Corollary 3.1 yields:

Corollary 3.6

Let \(g:X_{2}\rightarrow \left] 0,+\infty \right[ \) and \(h:X_{1}\rightarrow \left] 0,+\infty \right[ \). Then

$$\begin{aligned} \sup _{z\in X_{1}}\left( g^{*}\Delta h\right) \left( z,z\right) =\sup _{y\in X_{2}}\left( h^{*}\Delta g\right) \left( y,y\right) . \end{aligned}$$
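The computation behind Corollary 3.6 rests on the scaling rule \((\lambda g)^{*}(z)=\lambda g^{*}(z/\lambda )\) for \(\lambda >0\), with \(\lambda \) playing the role of \(h(z)\); this is exactly what produces the \(\Delta \) operation. A minimal numerical check of the rule on a grid (the function g, the grid and the sample points are our choices):

```python
import math

g = lambda y: math.exp(y)                    # any function on X_2
Y = [-5.0 + 0.01 * k for k in range(1001)]   # grid standing in for X_2

def conj(fun, z):
    """Fenchel conjugate on the grid: sup_y { z*y - fun(y) }."""
    return max(z * y - fun(y) for y in Y)

for lam, z in [(2.5, 1.3), (0.7, 0.4)]:      # lam plays the role of h(z) > 0
    left = conj(lambda y: lam * g(y), z)     # (lam * g)*(z)
    right = lam * conj(g, z / lam)           # lam * g*(z / lam)
    assert abs(left - right) < 1e-6
```

The identity is exact term by term, since \(\langle z,y\rangle -\lambda g(y)=\lambda \left( \langle z/\lambda ,y\rangle -g(y)\right) \); the grid only introduces floating-point rounding.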

4 An Algorithm

In the setting of the preceding section, we will from now on assume that \(X_{1}\) and \(X_{2}\) are both the Euclidean space \(\mathbb {R}^{n}\) and that \(\langle \cdot ,\cdot \rangle \) is the standard Euclidean inner product. Let \(f:\mathbb {R}^{n}\times \mathbb {R}^{n}\rightarrow \mathbb {\overline{R}},\) and consider the sets

$$\begin{aligned} \mathcal {P}_{l}=\{\bar{z}\in \mathbb {R}^{n}:y\in \mathbb {R}^{n}\quad \text { and }\quad \bar{z}\in \partial f(\bar{z},\cdot )(y)\ \text {imply}\ y\in \partial f(\cdot ,y)(\bar{z})\} \end{aligned}$$

and

$$\begin{aligned} \mathcal {D}_{l}=\{\bar{y}\in \mathbb {R}^{n}:z\in \mathbb {R}^{n}\quad \text { and }\quad \bar{y}\in \partial f(\cdot ,\bar{y})(z)\ \text {imply}\ z\in \partial f(z,\cdot )(\bar{y})\}. \end{aligned}$$

According to Theorem 3.1, the set \(\mathcal {P}_{l}\) contains all the optimal solutions to \((\mathcal {P})\) and the set \(\mathcal {D}_{l}\) contains all the optimal solutions to \((\mathcal {D})\).

For \({\bar{z}}\in \mathbb {R}^{n}\) and \({\bar{y}}\in \mathbb {R}^{n},\) we consider the auxiliary problems

$$\begin{aligned} {S}{(\bar{z})}\qquad \text {maximize }f(\cdot ,y)^{*}(y)\text { subject to }y\in \partial f(\bar{z},\cdot )^{*}(\bar{z}) \end{aligned}$$

and

$$\begin{aligned} {T}{(\bar{y})}\qquad \text {maximize }f(z,\cdot )^{*}(z)\text { subject to }z\in \partial f(\cdot ,\bar{y})^{*}(\bar{y}). \end{aligned}$$

We will denote by \(\mathcal {O}_{S(\bar{z})}\) and \(\mathcal {O}_{T(\bar{y})}\) the sets of optimal solutions to problems \({S}{(\bar{z})}\) and \({T}{(\bar{y})}\), respectively.

Proposition 4.1

  1. (1)

    If \(\bar{z}\in \mathcal {P}_{l}\) and \(f(\bar{z},\cdot )\) is convex, proper and l.s.c., then the function \(y\mapsto f(\cdot ,y)^{*}(y) \) is constant on \(\partial f(\bar{z},\cdot )^{*}(\bar{z})\).

  2. (2)

    If \(\bar{y}\in \mathcal {D}_{l}\) and \(f(\cdot ,\bar{y})\) is convex, proper and l.s.c., then the function \(z\mapsto f(z,\cdot )^{*}(z) \) is constant on \(\partial f(\cdot ,\bar{y})^{*}(\bar{y})\).

Proof

By symmetry, we only need to prove (1). Let \(\bar{z}\in \mathcal {P}_{l}.\) If \(y\in \partial f(\bar{z},\cdot )^{*}(\bar{z}),\) then

$$\begin{aligned} f(\bar{z},\cdot )^{*}(\bar{z})=\left\langle \bar{z},y\right\rangle -f( \bar{z},\cdot )^{**}\left( y\right) =\left\langle \bar{z} ,y\right\rangle -f(\bar{z},y) \end{aligned}$$
(5)

and, since \(y\in \partial f(\cdot ,y)(\bar{z})\) (as \(\bar{z}\in \mathcal {P} _{l}),\)

$$\begin{aligned} f(\cdot ,y)^{*}(y)=\left\langle \bar{z},y\right\rangle -f(\bar{z},y). \end{aligned}$$
(6)

From (6) and (5) it immediately follows that

$$\begin{aligned} f(\cdot ,y)^{*}(y)=f(\bar{z},\cdot )^{*}(\bar{z}). \end{aligned}$$
(7)

\(\square \)

Starting with an initial point \(z^{0}\in \mathbb {R}^{n},\) we construct two sequences \(z^{k}\) and \(y^{k}\) as follows:

$$\begin{aligned} \begin{array}{ccc} z^{0} &{} \mapsto &{} y^{0}\in \mathcal {O}_{S(z^{0})} \\ z^{1}\in \mathcal {O}_{T(y^{0})} &{} \mapsto &{} y^{1}\in \mathcal {O}_{S(z^{1})} \\ \vdots &{} &{} \vdots \\ z^{k+1}\in \mathcal {O}_{T(y^{k})} &{} \mapsto &{} y^{k+1}\in \mathcal {O} _{S(z^{k+1})}. \end{array} \end{aligned}$$
(8)

This algorithm requires the auxiliary problems occurring at each iteration to be solvable, and from now on we will assume that this is the case. The following proposition states sufficient conditions for solvability of these problems. We recall that a function is called co-finite if its conjugate is finite-valued everywhere.
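In the d.c. case \(f(z,y):=h(z)+g(y)\) with g and h convex, the feasible set of \(S(\bar{z})\) is \(\partial g^{*}(\bar{z})\) and that of \(T(\bar{y})\) is \(\partial h^{*}(\bar{y})\), so the scheme becomes a DCA-type alternation. The toy instance below is our choice: \(g(y)=y^{2}/2\) and \(h(z)=z^{4}/4\), whose conjugates have singleton subdifferentials, making the inner maximizations trivial; it illustrates that the primal values never decrease along the iteration:

```python
def cbrt(t):
    """Signed cube root (the subgradient of h* below)."""
    return abs(t) ** (1.0 / 3.0) * (1.0 if t >= 0 else -1.0)

g_star = lambda z: z * z / 2.0       # g(y) = y^2/2  =>  g*(z) = z^2/2
h = lambda z: z ** 4 / 4.0           # dh*(y) = { y^(1/3) }
primal = lambda z: g_star(z) - h(z)  # objective of (P): f(z,.)*(z)

z = 0.5                              # initial point z^0
values = [primal(z)]
for _ in range(30):
    y = z            # y^k: the unique element of dg*(z^k) = {z^k}
    z = cbrt(y)      # z^{k+1}: the unique element of dh*(y^k)
    values.append(primal(z))

assert all(a <= b + 1e-12 for a, b in zip(values, values[1:]))
assert abs(z - 1.0) < 1e-6  # reaches the maximizer of z^2/2 - z^4/4
```

Starting instead from \(z^{0}=0\), the scheme stagnates at the critical point 0, which for this f belongs to \(\mathcal {P}_{l}\) but is not globally optimal; this reflects the local-search character of the method.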

Proposition 4.2

Let \({\bar{z}}\in \mathbb {R}^{n}\) and \({\bar{y}}\in \mathbb {R}^{n},\) and assume that f is continuous. If \(f(\bar{z},\cdot )\) is co-finite and the mapping \(y\longmapsto {\text {dom}}\,f(\cdot ,y)\) is compact-valued and continuous, then \(\mathcal {O}_{S(\bar{z})}\ne \emptyset \). Analogously, if \(f(\cdot ,\bar{y})\) is co-finite and the mapping \(z\longmapsto {\text {dom}}\,f(z,\cdot )\) is compact-valued and continuous, then \(\mathcal {O}_{T(\bar{y})}\ne \emptyset \).

Proof

The co-finiteness of \(f(\bar{z},\cdot )\) guarantees that the feasible set \( \partial f(\bar{z},\cdot )^{*}(\bar{z})\) of \({S}{(\bar{z})}\) is nonempty and compact. On the other hand, continuity of f together with compact-valuedness and continuity of the mapping \(y\longmapsto {\text {dom}}\,f(\cdot ,y)\) implies, by Berge's maximum theorem, continuity of the objective function \(y\longmapsto f(\cdot ,y)^{*}(y).\) It thus suffices to apply Weierstrass' extreme value theorem to conclude that \(\mathcal {O}_{S(\bar{z})}\ne \emptyset \). The proof of the nonemptiness of \(\mathcal {O}_{T(\bar{y})}\) is the same, mutatis mutandis. \(\square \)

Theorem 4.1

Let \(f:\mathbb {R}^{n}\times \mathbb {R} ^{n}\rightarrow \mathbb {R}\cup \{+\infty \}\) and let the sequences \(z^{k}\) and \(y^{k}\) be constructed according to (8). Then

$$\begin{aligned} f\left( z^{k+1},\cdot \right) ^{*}\left( z^{k+1}\right) \ge f\left( \cdot ,y^{k}\right) ^{*}\left( y^{k}\right) \ge f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right) . \end{aligned}$$
(9)

The first inequality holds with the equal sign if and only if \(z^{k+1}\in \partial f(z^{k+1},\cdot )(y^{k}),\) in which case, assuming that the function \(f(\cdot ,y^{k})\) is convex, proper and l.s.c., we have \(y^{k}\in \mathcal {D}_{l}\). The second inequality holds with the equal sign if and only if \(y^{k}\in \partial f(\cdot ,y^{k})(z^{k}),\) in which case, assuming that the function \(f(z^{k},\cdot )\) is convex, proper and l.s.c., we have \(z^{k}\in \mathcal {P}_{l}\).

Proof

By the Fenchel inequality and the relations \(z^{k+1}\in \partial f(\cdot ,y^{k})^{*}(y^{k})\) and \(y^{k}\in \partial f(z^{k},\cdot )^{*}(z^{k}) \), we obtain

$$\begin{aligned} f\left( z^{k+1},\cdot \right) ^{*}\left( z^{k+1}\right)\ge & {} \left\langle z^{k+1},y^{k}\right\rangle -f\left( z^{k+1},y^{k}\right) \\= & {} f\left( \cdot ,y^{k}\right) ^{*}\left( y^{k}\right) \\\ge & {} \left\langle z^{k},y^{k}\right\rangle -f\left( z^{k},y^{k}\right) \\= & {} f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right) . \end{aligned}$$

The “if and only if” assertions follow from the above chain of inequalities, combined with the well-known characterization of subgradients as those elements that satisfy the Fenchel inequality with the equal sign.

Let us assume that the first inequality holds with the equal sign, and let \( z\in \mathbb {R}^{n}\) be such that \(y^{k}\in \partial f(\cdot ,y^{k})(z)\). Since we have \(z^{k+1}\in \mathcal {O}_{T(y^{k})}\) and \(z^{k+1}\in \partial f(z^{k+1},\cdot )(y^{k}),\) using that \(f(\cdot ,y^{k})\) is convex, proper and l.s.c., we obtain

$$\begin{aligned} f(z,\cdot )^{*}(z)\le & {} f\left( z^{k+1},\cdot \right) ^{*}\left( z^{k+1}\right) =\left\langle z^{k+1},y^{k}\right\rangle -f\left( z^{k+1},y^{k}\right) \\= & {} \left\langle z^{k+1},y^{k}\right\rangle -f\left( \cdot ,y^{k}\right) ^{**}\left( z^{k+1}\right) \le \left\langle z,y^{k}\right\rangle -f\left( \cdot ,y^{k}\right) ^{**}\left( z\right) \\= & {} \left\langle z,y^{k}\right\rangle -f\left( z,y^{k}\right) \le f(z,\cdot )^{*}(z); \end{aligned}$$

therefore \(f(z,\cdot )^{*}(z)=\left\langle z,y^{k}\right\rangle -f(z,y^{k}),\) that is, \(z\in \partial f(z,\cdot )(y^{k}).\) This shows that \(y^{k}\in \mathcal {D}_{l}\).

Let us assume that the second inequality holds with the equal sign, and let \( y\in \mathbb {R}^{n}\) be such that \(z^{k}\in \partial f(z^{k},\cdot )(y)\). Since we have \(y^{k}\in \mathcal {O}_{S(z^{k})}\) and \(y^{k}\in \partial f(\cdot ,y^{k})(z^{k}),\) using that \(f(z^{k},\cdot )\) is convex, proper and l.s.c. and the fact that \(z^{k}\in \partial f(z^{k},\cdot )^{**}(y),\) we obtain

$$\begin{aligned} f(\cdot ,y)^{*}(y)\le & {} f\left( \cdot ,y^{k}\right) ^{*}(y^{k})=\left\langle z^{k},y^{k}\right\rangle -f\left( z^{k},y^{k}\right) =\left\langle z^{k},y^{k}\right\rangle -f\left( z^{k},\cdot \right) ^{**}\left( y^{k}\right) \\\le & {} \left\langle z^{k},y\right\rangle -f\left( z^{k},\cdot \right) ^{**}\left( y\right) =\left\langle z^{k},y\right\rangle -f\left( z^{k},y\right) \le f(\cdot ,y)^{*}(y); \end{aligned}$$

therefore \(f(\cdot ,y)^{*}(y)=\left\langle z^{k},y\right\rangle -f(z^{k},y),\) that is, \(y\in \partial f(\cdot ,y)(z^{k}).\) This shows that \( z^{k}\in \mathcal {P}_{l}\). \(\square \)

Corollary 4.1

  1. (1)

Let f, \(z^{k}\) and \(y^{k}\) be as in Theorem 4.1. Then

    $$\begin{aligned} \lim _{k\rightarrow \infty }f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right) =\lim _{k\rightarrow \infty }f\left( \cdot ,y^{k}\right) ^{*}\left( y^{k}\right) . \end{aligned}$$
    (10)
  2. (2)

    If f is continuous, \(\overline{z}\) is a cluster point of the sequence \(z^{k},\) and the sequence \(y^{k}\) is bounded, then

    $$\begin{aligned} f(\overline{z},\cdot )^{*}(\overline{z})=\lim _{k\rightarrow \infty }f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right) . \end{aligned}$$
    (11)
  3. (3)

    If f is continuous, \(\overline{y}\) is a cluster point of the sequence \(y^{k},\) and the sequence \(z^{k}\) is bounded, then

    $$\begin{aligned} f(\cdot ,\overline{y})^{*}(\overline{y})=\lim _{k\rightarrow \infty }f\left( \cdot ,y^{k}\right) ^{*}\left( y^{k}\right) . \end{aligned}$$

Proof

(1) The equality (10) is an immediate consequence of (9).

(2) If f is continuous, then the function \(z\mapsto f(z,\cdot )^{*}(z)\) is l.s.c. If \(\overline{z}\) is a cluster point of \(z^{k}\), then

$$\begin{aligned} \lim _{j\rightarrow \infty }z^{k_{j}}=\overline{z} \end{aligned}$$

for some subsequence \(z^{k_{j}}.\) Since the sequence \(y^{k}\) is bounded, we can suppose (extracting again subsequences if necessary) that the sequence \( y^{k_{j}}\) converges to a point \(\overline{y}\). Using the chain of inequalities at the beginning of the proof of Theorem 4.1, the continuity of f,  and the lower semicontinuity of the function \(z\mapsto f(z,\cdot )^{*}(z),\) we obtain

$$\begin{aligned} \lim _{k\rightarrow \infty }f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right)= & {} \lim _{j\rightarrow \infty }f\left( z^{k_{j}},\cdot \right) ^{*}\left( z^{k_{j}}\right) =\lim _{j\rightarrow \infty }\left( \left\langle z^{k_{j}},y^{k_{j}}\right\rangle -f\left( z^{k_{j}},y^{k_{j}}\right) \right) \\= & {} \left\langle \overline{z},\overline{y}\right\rangle -f(\overline{z}, \overline{y})\le f(\overline{z},\cdot )^{*}(\overline{z})\le \lim _{j\rightarrow \infty }f\left( z^{k_{j}},\cdot \right) ^{*}\left( z^{k_{j}}\right) \\= & {} \lim _{k\rightarrow \infty }f\left( z^{k},\cdot \right) ^{*}\left( z^{k}\right) , \end{aligned}$$

which proves (11).

The proof of (3) is similar to that of (2). \(\square \)

Theorem 4.2

Let f be continuous.

  1. (1)

    If f is convex and proper in its first argument, \( \overline{z}\) is a cluster point of \(z^{k}\), and the sequence \(y^{k}\) is bounded, then \(\overline{z}\in \mathcal {P}_{l}\).

  2. (2)

    If f is convex and proper in its second argument, \( \overline{y}\) is a cluster point of \(y^{k}\), and the sequence \(z^{k}\) is bounded, then \(\overline{y}\in \mathcal {D}_{l}\).

Proof

(1) Let \(y\in \mathbb {R}^{n}\) be such that \(\overline{z}\in \partial f(\overline{z},\cdot )(y).\) Since \(\overline{z}\) is a cluster point of \(z^{k}\), we have

$$\begin{aligned} \lim _{j\rightarrow \infty }z^{k_{j}}=\overline{z} \end{aligned}$$

for some subsequence \(z^{k_{j}}.\) We can suppose (extracting subsequences if necessary) that the sequence \(y^{k_{j}-1}\) converges to a point \(\overline{y} \in \mathbb {R}^{n}\). Given that \(z^{k_{j}}\in \mathcal {O}_{T(y^{k_{j}-1})}\) and \(y^{k_{j}}\in \mathcal {O}_{S(z^{k_{j}})},\) we have \(z^{k_{j}}\in \partial f(\cdot ,y^{k_{j}-1})^{*}(y^{k_{j}-1})\) and \(y^{k_{j}}\in \partial f(z^{k_{j}},\cdot )^{*}(z^{k_{j}})\). Hence, using the properties of f and Corollary 4.1, we get

$$\begin{aligned} f(\cdot ,y)^{*}(y)\le & {} \lim _{j\rightarrow \infty }f\left( \cdot ,y^{k_{j}-1}\right) ^{*}\left( y^{k_{j}-1}\right) \nonumber \\= & {} \lim _{j\rightarrow \infty }\left( \left\langle z^{k_{j}},y^{k_{j}-1}\right\rangle -f\left( z^{k_{j}},y^{k_{j}-1}\right) \right) \nonumber \\\le & {} \lim _{j\rightarrow \infty }f\left( z^{k_{j}},\cdot \right) ^{*}\left( z^{k_{j}}\right) \nonumber \\= & {} f(\overline{z},\cdot )^{*}(\overline{z}) \nonumber \\= & {} \left\langle \overline{z},{y}\right\rangle -f(\overline{z},{y}) \end{aligned}$$
(12)

From (12) and the Fenchel inequality, we obtain \({y}\in \partial f(\cdot ,{y})(\overline{z})\). This shows that \(\overline{z}\in \mathcal {P}_{l}\).

The proof of (2) is similar. \(\square \)