1 Introduction

In 2008, Bauschke et al. [2] first addressed the question of how to transform one convex function into another in a continuous manner. Given proper convex functions \(f_0\) and \(f_1\), their proposed solution, the proximal average, used Fenchel conjugates to define a parameterized function \(PA(x,\lambda )\) such that \(PA\) is epi-continuous with respect to \(\lambda \), and \(PA(x,0)=f_0(x)\), \(PA(x,1)=f_1(x)\) for all \(x\). The proximal average has been studied extensively since its original conception, and many favourable properties and applications of this approach have arisen [1, 3, 4, 6–9, 11, 13–15, 17, 18]. For example, the minimizers of the proximal average function change continuously with respect to \(\lambda \) [8].

The proximal average has also been generalized and reformulated in a number of useful ways. For example, in [1], the proximal average is generalized to a finite number of convex functions. In [5], the proximal average is generalized to allow for alternate kernels, which further allowed for applications with monotone operators. In [9], the proximal average is reformulated to apply to saddle functions. Finally, in [11], the proximal average was reformulated to work with two (possibly nonconvex) proper, lsc, prox-bounded functions. This document generalizes the work done in [11] to allow for a finite number of such functions.

Given two proper, lsc, prox-bounded functions, \(f_0\) and \(f_1\), the NC-proximal average was originally defined as

$$\begin{aligned} PA_r(x,\lambda )\, {:=}\,-e_{r+\lambda (1-\lambda )}\left( -(1-\lambda )e_rf_0-\lambda e_rf_1\right) (x) \end{aligned}$$

where \(\lambda \in [0,1]\) and \(e_rf\) is the Moreau envelope of \(f\) using the prox-parameter \(r,\) defined as

$$\begin{aligned} e_rf(x)\, {:=}\, \inf \limits _y\left\{ f(y)+\frac{r}{2}|y-x|^2\right\} . \end{aligned}$$

Associated with the Moreau envelope, and closely related to the NC-proximal average, is the proximal point mapping \(P_r f\) defined as

$$\begin{aligned} P_rf(x):=\mathop {\mathrm{argmin}}\limits _y\left\{ f(y)+\frac{r}{2}|y-x|^2\right\} . \end{aligned}$$
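
The two definitions above can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional function and a finite grid of candidate points \(y\); the helper name `moreau_envelope_and_prox`, the grid, and the two test functions are illustrative choices and do not come from [11]. It approximates \(e_rf(x)\) and \(P_rf(x)\) by brute force and shows that the proximal point mapping can be set-valued when \(f\) is nonconvex.

```python
def moreau_envelope_and_prox(f, x, r, grid):
    """Approximate e_r f(x) and P_r f(x) by minimizing over a finite grid of y values."""
    values = [f(y) + 0.5 * r * (y - x) ** 2 for y in grid]
    best = min(values)
    # Keep every grid point attaining the minimum: P_r f(x) may be set-valued.
    minimizers = [y for y, v in zip(grid, values) if v - best <= 1e-12]
    return best, minimizers


# Illustrative grid on [-3, 3] with spacing 0.001 (an arbitrary choice).
grid = [i / 1000.0 for i in range(-3000, 3001)]

# Convex case f = |.| with r = 2: the proximal point of x = 1 is unique.
env_convex, prox_convex = moreau_envelope_and_prox(abs, 1.0, 2.0, grid)

# Nonconvex case f = -|.| with r = 2: two proximal points at x = 0.
env_nonconvex, prox_nonconvex = moreau_envelope_and_prox(
    lambda y: -abs(y), 0.0, 2.0, grid
)
```

For the convex function the sketch returns a single proximal point, while for \(f=-|\cdot |\) it returns the pair \(\pm \frac{1}{r}\), which is the kind of behaviour the nonconvex setting has to accommodate.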

In [11] the function \(PA_r\) is analyzed and a number of propositions and theorems are developed in order to describe its properties. Here, we extend those results for a finite number of proper, lsc, prox-bounded functions \(f_i, i\in \{1,2,\ldots ,m\}.\) We begin by defining the NC-proximal average as

$$\begin{aligned} \begin{aligned} PA_{r,\delta }(x,\lambda ) \,\,&{:=} -e_{r+\delta (\lambda )} \left( -\sum \limits _{i=1}^m\lambda _ie_rf_i\right) (x),\\ \lambda \in \varLambda \,\,&{:=} \left\{ (\lambda _1,\lambda _2,\ldots ,\lambda _m)\in \mathbb R ^m:\lambda _i\ge 0 \text{ for } \text{ all } i \text{ and } \sum \limits _{i=1}^m\lambda _i=1\right\} , \end{aligned} \end{aligned}$$
(1.1)

and \(\delta \) is any continuous function such that \(\delta (\lambda )=0\) if \(\lambda =e_i\) (the canonical unit vector whose \(i^\mathrm{th}\) component is 1) for some \(i,\) and \(\delta (\lambda )>0\) otherwise. This definition generalizes that of [11] in two respects. First, the original definition is restricted to outer prox-parameter \(r+\lambda (1-\lambda ),\) when in fact the \(\lambda (1-\lambda )\) term can be replaced by any function \(\delta \) as described above. Second, the results found in [11] are reworked in order to accommodate any finite number of functions.
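
As an illustration of an admissible \(\delta \) (a choice made here for concreteness, not one taken from [11]), consider

$$\begin{aligned} \delta (\lambda )\,{:=}\,1-\sum \limits _{i=1}^m\lambda _i^2. \end{aligned}$$

This function is continuous, and for \(\lambda \in \varLambda \) we have \(\lambda _i^2\le \lambda _i\) for every \(i\), so \(\sum _{i=1}^m\lambda _i^2\le \sum _{i=1}^m\lambda _i=1\), with equality exactly when every \(\lambda _i\in \{0,1\}\), that is, when \(\lambda =e_i\) for some \(i\). For \(m=2\) and \(\lambda =(1-t,t)\) it reduces to

$$\begin{aligned} \delta (\lambda )=1-(1-t)^2-t^2=2t(1-t), \end{aligned}$$

which is a positive multiple of the term \(t(1-t)\) used in the two-function definition.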

Remark 1

It should be clear that the choice of the function \(\delta \) used in defining the NC-proximal average will have a great impact on the parameterized function \(PA_{r,\delta }\). However, it will become clear in this paper that the underlying properties of \(PA_{r,\delta }\) are in fact not affected by \(\delta \). As such, for ease of notation, except when necessary we shall simplify \(PA_{r,\delta }\) to \(PA_r\).

The remainder of this article is organized as follows. Section 2 provides definitions and shows that \(PA_r\) is well-defined. Section 3 explores the prox-regularity and para-prox-regularity aspects of the function, and Sect. 4 considers its stability. We conclude, in Sect. 5, with some discussion on the minimizers of the NC-proximal average, including an example that demonstrates that the minimizers of the NC-proximal average may be multi-valued and discontinuous.

2 Preliminaries

Throughout this paper, we use \(q\) to represent the norm-squared function, \(q(x)=|x|^2\). This section restates some definitions we need, and shows that under basic assumptions, \(PA_r\) is a well-defined function.

Definition 1

A proper function \(f:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) is said to be prox-bounded if there exist \(r>0\) and a point \(\bar{x}\) such that \(e_rf(\bar{x})>-\infty .\) The infimum of the set of all such \(r\) is called the threshold of prox-boundedness.
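
For a concrete instance of Definition 1 (an illustrative example, not one from the cited works), take \(f(x)=-\frac{c}{2}|x|^2\) with \(c>0\). Expanding the square in the Moreau envelope gives

$$\begin{aligned} e_rf(\bar{x})=\inf \limits _y\left\{ \frac{r-c}{2}|y|^2-r\langle y,\bar{x}\rangle +\frac{r}{2}|\bar{x}|^2\right\} = {\left\{ \begin{array}{ll} -\frac{rc}{2(r-c)}|\bar{x}|^2, &{} r>c\\ -\infty , &{} r<c, \end{array}\right. } \end{aligned}$$

where for \(r>c\) the infimum is attained at \(y=\frac{r}{r-c}\bar{x}\). Hence this \(f\) is prox-bounded with threshold \(c\), although it is neither convex nor bounded below.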

Definition 2

A function is lower-\(\mathcal C ^2\) on an open set \(V\) if it is finite-valued on \(V\) and at any point \(x\in V\) the function appended with a quadratic term is convex on some open convex neighborhood \(V^{\prime }\) of \(x.\) The function is said to be lower-\(\mathcal C ^2\) (with no mention of \(V\)) if \(V=\mathbb R ^n.\)

Our first task is to confirm that \(PA_r\) is a well-defined and well-behaved function. The following proposition generalizes [11, Prop 2.5].

Proposition 1

For \(i\in \{1,2,\ldots ,m\}\) let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, prox-bounded functions with respective thresholds \(\bar{r}_i\). Let \(r>\max \limits _i\{\bar{r}_i\}.\) Then for all \(\lambda \in \varLambda \), \(PA_r\) is a proper function in \(x\). Furthermore, if \(\lambda _i\ne 1\) for all \(i\), then \(PA_r\) defines a lower-\(\mathcal C ^2\) function in \(x\). Finally, if for some \(i\) one has that \(f_i+\frac{r}{2}q\) is convex, then \(PA_r(\cdot ,e_i)=f_i.\)

Proof

We know that \(-e_rf_i\) is well-defined for all \(i\), since \(r>\bar{r}_i\) for all \(i\). By [11, Lem 2.4], which is extendible to the case of \(m\) functions, we know that \(-\sum _{i=1}^m\lambda _i e_r f_i\) is a proper, lower-\(\mathcal C ^2\), prox-bounded function, with threshold \(\bar{r}\le \sum _{i=1}^m\lambda _ir=r\). Thus the Moreau envelope of \(-\sum _{i=1}^m\lambda _ie_rf_i\) is well-defined and proper whenever the prox-parameter is greater than or equal to \(r\) (as is the case when \(\lambda \in \varLambda \)), and it is lower-\(\mathcal C ^2\) whenever the prox-parameter is strictly greater than \(r\) (as is the case when \(\lambda \in \varLambda \) and \(\lambda _i\ne 1\) for all \(i\)). The last statement is proved by applying [16, Ex 11.26 (d)] to \(PA_r(x,e_i)=-e_r(-e_rf_i)(x)\).\(\square \)
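
The last statement of Proposition 1 can be checked numerically in a simple case. The sketch below is a rough grid-based approximation under stated assumptions: the functions \(|x|\) and \(\frac{1}{2}(x-1)^2\), the value \(r=2\), the grid on \([-4,4]\), and the choice \(\delta (\lambda )=1-\sum _i\lambda _i^2\) are all illustrative, and the helper names `envelope` and `nc_proximal_average` are hypothetical. It samples \(PA_{r,\delta }(\cdot ,\lambda )\) from definition (1.1) and compares \(PA_r(\cdot ,e_1)\) with \(f_1\).

```python
def envelope(values, grid, r):
    """Grid Moreau envelope: min over grid y of h(y) + (r/2)(y - x)^2, for each grid x."""
    return [min(h + 0.5 * r * (y - x) ** 2 for h, y in zip(values, grid)) for x in grid]


def nc_proximal_average(fs, lam, r, delta, grid):
    """Sample PA_{r,delta}(., lam) = -e_{r+delta(lam)}(-sum_i lam_i e_r f_i) on the grid."""
    envs = [envelope([f(y) for y in grid], grid, r) for f in fs]
    inner = [-sum(l * e[k] for l, e in zip(lam, envs)) for k in range(len(grid))]
    return [-v for v in envelope(inner, grid, r + delta(lam))]


def delta(lam):
    """Hypothetical choice of delta: it vanishes exactly at the unit vectors of the simplex."""
    return 1.0 - sum(l * l for l in lam)


# Illustrative data (assumptions): two convex functions, r = 2, grid on [-4, 4].
grid = [i * 0.02 for i in range(-200, 201)]
fs = [abs, lambda y: 0.5 * (y - 1.0) ** 2]
r = 2.0

pa_e1 = nc_proximal_average(fs, (1.0, 0.0), r, delta, grid)
pa_mid = nc_proximal_average(fs, (0.5, 0.5), r, delta, grid)

# Compare PA_r(., e_1) with f_1 away from the grid boundary, where truncation is harmless.
interior = [k for k, x in enumerate(grid) if abs(x) <= 1.0]
max_err = max(abs(pa_e1[k] - fs[0](grid[k])) for k in interior)
```

Since both test functions are convex, \(f_i+\frac{r}{2}q\) is convex and the proposition predicts \(PA_r(\cdot ,e_1)=f_1\); in this sketch `max_err` measures how closely the grid approximation reproduces that identity. The quadratic cost of the brute-force envelope limits the approach to small grids.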

3 Prox-regularity

In this section, we wish to establish the conditions under which the function \(-\sum _{i=1}^m\lambda _i e_r f_i\) is para-prox-regular, so that in Sect. 4 we may explore the stability of \(PA_r.\) Let us recall what we mean by prox-regularity and para-prox-regularity of a function.

Definition 3

A proper function \(f\) is prox-regular at a point \(\bar{x}\) for \(\bar{v}\in \partial f(\bar{x})\) if \(f\) is locally lsc at \(\bar{x}\) and there exist \(\epsilon >0\) and \(r>0\) such that

$$\begin{aligned} f(x^{\prime })\ge f(x)+\langle v,x^{\prime }-x\rangle -\frac{r}{2}|x^{\prime }-x|^2 \end{aligned}$$
(3.1)

whenever \(x^{\prime }\ne x, |x^{\prime }-\bar{x}|<\epsilon , |x-\bar{x}|<\epsilon , |f(x)-f(\bar{x})|<\epsilon , v\in \partial f(x),\) and \(|v-\bar{v}|<\epsilon .\) We say the function is continuously prox-regular at \(\bar{x}\) for \(\bar{v}\) if, in addition, \(f\) is continuous as a function of \((x,v)\in \mathrm{gph}\,\partial f\) at \((\bar{x},\bar{v}).\) The function is said to be prox-regular at \(\bar{x}\) (with no mention of \(\bar{v}\)) if it is prox-regular at \(\bar{x}\) for every \(\bar{v}\in \partial f(\bar{x}),\) and simply prox-regular (with no mention of \(\bar{x}\)) if it is prox-regular at \(\bar{x}\) for every \(\bar{x}\in \mathrm{dom}\,f.\)

From a graphical point of view, a prox-regular function is one that is locally bounded below by quadratics of equal curvature. Para-prox-regularity is an extension of this idea that includes an extra parameter \(\lambda \).

Definition 4

A proper, lsc function \(f:\mathbb R ^n\times \mathbb R ^s\rightarrow \mathbb R \cup \{\infty \}\) is parametrically prox-regular in \(x\) at \(\bar{x}\) for \(\bar{v}\in \partial _xf(\bar{x},\bar{\lambda })\) with compatible parametrization by \(\lambda \) at \(\bar{\lambda }\) (also referred to as para-prox-regular in \(x\) at \((\bar{x},\bar{\lambda })\) for \(\bar{v}\)), with parameters \(\epsilon >0\) and \(r>0,\) if

$$\begin{aligned} f(x^{\prime },\lambda )\ge f(x,\lambda )+\langle v,x^{\prime }-x\rangle -\frac{r}{2}|x^{\prime }-x|^2 \end{aligned}$$
(3.2)

whenever \(x^{\prime }\ne x, |x^{\prime }-\bar{x}|<\epsilon , |x-\bar{x}|<\epsilon , |f(x,\lambda )-f(\bar{x},\bar{\lambda })|<\epsilon , |\lambda -\bar{\lambda }|<\epsilon , v\in \partial _xf(x,\lambda ),\) and \(|v-\bar{v}|<\epsilon .\) It is continuously para-prox-regular in \(x\) at \((\bar{x},\bar{\lambda })\) for \(\bar{v}\) if, in addition, \(f\) is continuous as a function of \((x,\lambda ,v)\in \mathrm{gph}\,\partial _xf\) at \((\bar{x},\bar{\lambda },\bar{v}).\) If the parameter \(\bar{\lambda },\) the subgradient \(\bar{v},\) or the point \(\bar{x}\) is omitted, then the para-prox-regularity of \(f\) is understood to mean for all \(\bar{\lambda }\in \mathrm{dom}\,f(\bar{x},\cdot ),\) for all \(\bar{v}\in \partial _xf(\bar{x},\bar{\lambda }),\) or for all \(\bar{x}\in \mathrm{dom}\,f(\cdot ,\bar{\lambda }),\) respectively.

Proposition 2

For \(i\in \{1,2,\ldots ,m\}\) let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, and prox-bounded with threshold \(r_i.\) Let \(r>r_i\) for all \(i.\) Define

$$\begin{aligned} F(x,\lambda )= {\left\{ \begin{array}{ll} -\sum _{i=1}^m\lambda _ie_rf_i(x), &{}\lambda \in \varLambda \\ \infty , &{} \lambda \not \in \varLambda . \end{array}\right. } \end{aligned}$$

Then \(F\) is continuously para-prox-regular at any \(\bar{x}\), with compatible parametrization by \(\lambda \) at any \(\bar{\lambda }\in \varLambda .\) Moreover, \(F\) is lower-\(\mathcal C ^2\) and strictly continuous, and if \((0,y)\in \partial ^\infty F(\bar{x},\bar{\lambda })\) then \(y=0.\)

Proof

Since \(f_i\) is proper, lsc and prox-bounded for all \(i,\) [16, Ex 10.32] gives us that \(-e_rf_i\) is lower-\(\mathcal C ^2\) for all \(i.\) The sum of lower-\(\mathcal C ^2\) functions is lower-\(\mathcal C ^2,\) and any lower-\(\mathcal C ^2\) function is strictly continuous [16, Thm 10.31], so \(F\) is lower-\(\mathcal C ^2\) and strictly continuous. Finally, [16, Thm 9.31] states that strict continuity of \(F\) at \((\bar{x},\bar{\lambda })\) is equivalent to \(\partial ^\infty F(\bar{x},\bar{\lambda })=\{0\},\) which gives us that \((0,y)\in \partial ^\infty F(\bar{x},\bar{\lambda })\Rightarrow y=0.\) This gives us all the conditions of [10, Thm 5.7], and its conclusion is the result we seek.\(\square \)

Remark 2

The proof of [11, Lemma 3.3] can also be adapted to give a longer, but more direct, proof of Proposition 2.

4 Stability

We are now ready to explore the stability of the NC-proximal average. By Proposition 2, we can see that \(PA_r\) is the negative of the Moreau envelope of a para-prox-regular function. This allows us to take advantage of the work done in [12], where the tilt stability and full stability of Moreau envelopes and proximal mappings of para-prox-regular functions were studied.

Theorem 1

[12, Thm 4.6] Let \(F:\mathbb R ^n\times \mathbb R ^s\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, and continuously para-prox-regular at \((\bar{x},\bar{\lambda })\) for \(\bar{v}\in \partial _xF(\bar{x},\bar{\lambda }),\) with parameters \(\epsilon \) and \(r.\) Assume further that \(F\) is prox-bounded with threshold \(\rho ,\) and that \(F\) satisfies the following:

  1. \((0,y)\in \partial ^\infty F(\bar{x},\bar{\lambda })\Rightarrow y=0,\)

  2. \((0,\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(0)\Rightarrow \lambda ^{\prime }=0,\)

  3. \((x^{\prime },\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(v^{\prime }), v^{\prime }\ne 0\Rightarrow \langle x^{\prime },v^{\prime }\rangle >-\rho ^{\prime }|v^{\prime }|^2\) for some \(\rho ^{\prime }>0,\)

  4. \(\partial _xF(\bar{x},\cdot )\) has a continuous selection \(g\) near \(\bar{\lambda },\) with \(g(\bar{\lambda })=\bar{v}.\)

If \(\bar{r} > \max \{\rho , \rho ^{\prime }, r\}\), then there exist \(K>0\) and a neighborhood \(\mathcal B =B_\delta (\bar{x}+\frac{\bar{v}}{r},\bar{\lambda },\bar{r})\) such that for all \((x,\lambda ,r),(x^{\prime },\lambda ^{\prime },r^{\prime })\in \mathcal B \) we have that \(P_rF_\lambda (x)\) and \(P_{r^{\prime }}F_{\lambda ^{\prime }}(x^{\prime })\) are single-valued, with

$$\begin{aligned} |P_rF_\lambda (x)-P_{r^{\prime }}F_{\lambda ^{\prime }}(x^{\prime })|\le K|(r(x-\bar{x})-r^{\prime }(x^{\prime }-\bar{x}),\lambda -\lambda ^{\prime },r-r^{\prime })|, \end{aligned}$$

where \(F_\lambda (x)=F(x,\lambda ).\)

Lemma 1

[11, Lem 4.4] Suppose the function \(H:\mathbb R ^n\times \mathbb R ^s\rightarrow \mathbb R \cup \{\infty \}\) is finite, single-valued, and Lipschitz continuous in \((x,\lambda )\) near \((\bar{x},\bar{\lambda })\) with local Lipschitz constant \(\mathrm{Lip}\,H.\) Then

$$\begin{aligned} (0,\lambda ^{\prime })\in D^*H(\bar{x},\bar{\lambda }|H(\bar{x},\bar{\lambda }))(0)\Rightarrow \lambda ^{\prime }=0, \end{aligned}$$

and for \(\rho >\mathrm{Lip}\,H\) one has

$$\begin{aligned} (x^{\prime },\lambda ^{\prime })\in D^*H(\bar{x},\bar{\lambda }|H(\bar{x},\bar{\lambda }))(v^{\prime }), \quad v^{\prime }\ne 0\Rightarrow \langle x^{\prime },v^{\prime }\rangle >-\rho |v^{\prime }|^2. \end{aligned}$$

The next proposition is an analog of [11, Prop 4.5], rewritten to work with a finite number of functions. The proof of [11, Prop 4.5] is easily adaptable to this setting, so we present only the key details.

Proposition 3

For \(i\in \{1,2,\ldots ,m\}\), let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, and prox-bounded with threshold \(r_i.\) Let \(r>\max \limits _i\{r_i\},\) and define

$$\begin{aligned} F(x,\lambda ){:=}-\sum \limits _{i=1}^m\lambda _ie_rf_i(x). \end{aligned}$$

If \(P_rf_i\) is single-valued and Lipschitz continuous for all \(i,\) then the following three properties hold:

  1. \((0,\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(0)\Rightarrow \lambda ^{\prime }=0,\)

  2. for some \(\rho >0\) we have \((x^{\prime },\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(v^{\prime }),v^{\prime }\ne 0\Rightarrow \langle x^{\prime },v^{\prime }\rangle >-\rho |v^{\prime }|^2,\) and

  3. the set-valued mapping \(\partial _xF(\bar{x},\cdot )\) has a continuous selection \(g\) near \(\bar{\lambda }.\)

Proof

Since \(P_rf_i\) is Lipschitz continuous, we have that \(e_rf_i\in \mathcal C ^{1+}\) with \(\nabla e_rf_i=r(I-P_rf_i)\) [12, Thm 2.4]. Hence,

$$\begin{aligned} \partial _xF(\bar{x},\lambda )&=\nabla _x\left( -\sum \limits _{i=1}^m\lambda _ie_rf_i\right) (\bar{x},\lambda )\\&=r\left[ \left( \sum \limits _{i=1}^m\lambda _iP_rf_i(\bar{x})\right) -\bar{x}\right] \end{aligned}$$

which is linear in \(\lambda ,\) showing Property 3. Since \(P_rf_i\) is single-valued and Lipschitz continuous, we have \(\partial _xF(x,\lambda )\) single-valued and Lipschitz continuous. Properties 1 and 2 follow by applying Lemma 1.\(\square \)
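
The gradient formula used in this proof can be sanity-checked with a finite difference. The sketch below assumes \(m=2\), \(n=1\), and two convex functions whose envelopes and proximal mappings are known in closed form, namely \(|x|\) and \(\frac{1}{2}(x-1)^2\); these functions, the weights, and all helper names are illustrative assumptions rather than part of the source. It compares \(r\left[ \sum _i\lambda _iP_rf_i(x)-x\right] \) with a central difference of \(F(\cdot ,\lambda )\).

```python
def prox_abs(x, r):
    """P_r |.|(x): soft-thresholding with threshold 1/r."""
    if x > 1.0 / r:
        return x - 1.0 / r
    if x < -1.0 / r:
        return x + 1.0 / r
    return 0.0


def env_abs(x, r):
    """e_r |.|(x): the Huber function."""
    return 0.5 * r * x * x if abs(x) <= 1.0 / r else abs(x) - 0.5 / r


def prox_quad(x, r):
    """P_r f(x) for f(y) = (y - 1)^2 / 2."""
    return (r * x + 1.0) / (r + 1.0)


def env_quad(x, r):
    """e_r f(x) for f(y) = (y - 1)^2 / 2."""
    return 0.5 * r / (r + 1.0) * (x - 1.0) ** 2


def F(x, lam, r):
    """F(x, lam) = -sum_i lam_i e_r f_i(x) for the two illustrative functions."""
    return -(lam[0] * env_abs(x, r) + lam[1] * env_quad(x, r))


def grad_F_formula(x, lam, r):
    """Gradient predicted by the proof: r * (sum_i lam_i P_r f_i(x) - x)."""
    return r * (lam[0] * prox_abs(x, r) + lam[1] * prox_quad(x, r) - x)


def grad_F_numeric(x, lam, r, h=1e-6):
    """Central finite difference of F in x."""
    return (F(x + h, lam, r) - F(x - h, lam, r)) / (2.0 * h)


lam, r = (0.3, 0.7), 2.0
points = [-1.3, -0.2, 0.3, 0.9, 2.0]
max_gap = max(abs(grad_F_formula(x, lam, r) - grad_F_numeric(x, lam, r)) for x in points)
```

The formula is affine in \(\lambda \) for fixed \(x\), which is the feature the proof uses to obtain a continuous selection; `max_gap` records the largest discrepancy between the formula and the finite difference over the sample points.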

Proposition 4

For \(i\in \{1,2,\ldots ,m\}\), let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, and prox-bounded with threshold \(r_i.\) Let \(r>\max \limits _i\{r_i\}.\) Then \(PA_r(\cdot ,\lambda )+\frac{r+\delta (\lambda )}{2}q(\cdot -\bar{x})\) is convex for any \(\bar{x}.\) Hence, \(PA_r(\cdot ,\lambda )\) is lower-\(\mathcal C ^2.\)

Proof

Define \(F_\lambda \, {:=}\, -\sum _{i=1}^m\lambda _ie_rf_i.\) Then

$$\begin{aligned} PA_r+\frac{r+\delta (\lambda )}{2}q =-e_{r+\delta (\lambda )}(F_\lambda )+\frac{r+\delta (\lambda )}{2}q. \end{aligned}$$

By [16, Ex 11.26], we have

$$\begin{aligned} -e_{r+\delta (\lambda )}(F_\lambda )+\frac{r+\delta (\lambda )}{2} q=\left( F_\lambda +\frac{r+\delta (\lambda )}{2}q\right) ^*((r+\delta (\lambda ))\cdot ), \end{aligned}$$

where \(f^*(x)\, {:=}\, \sup _y\{\langle x, y \rangle - f(y)\}\) is the Fenchel conjugate as defined in [2]. This is an affine function composed with a convex function (as conjugate functions are convex), and as such it is convex. Notice that shifting the argument of \(q\) by \(\bar{x}\) only results in the addition of a linear term, as

$$\begin{aligned} q(x-\bar{x})=q(x)-2\langle x,\bar{x}\rangle +q(\bar{x}) \end{aligned}$$

where \(q(\bar{x})\) is constant and \(-2\langle x,\bar{x}\rangle \) is linear. Hence, \(PA_r(\cdot ,\lambda )+\frac{r+\delta (\lambda )}{2}q(\cdot -\bar{x})\) is convex.\(\square \)

Theorem 2

(Stability of \(PA_r\)) For \(i\in \{1,2,\ldots ,m\}\), let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, and prox-bounded with threshold \(r_i.\) Let \(\bar{r}>\max \limits _i\{r_i\}\) and \(\bar{r} > \rho ^{\prime }\) from Theorem 1 Condition 3. Suppose that for all \(i, P_{\bar{r}}f_i\) is single-valued and Lipschitz continuous (as is the case when \(f_i\) is prox-regular). Then \(PA_{\bar{r}}\) is well-defined and lower-\(\mathcal C ^2.\) If in addition

$$\begin{aligned} \mathrm{Lip}\,\left( \sum \limits _{i=1}^m\lambda _iP_{\bar{r}}f_i-I\right) \le 1, \end{aligned}$$
(4.1)

then for any \(\bar{\lambda }\) such that \(\delta (\bar{\lambda })>0\) we have

  1. \(PA_{\bar{r}}(\cdot ,\bar{\lambda })\in \mathcal C ^{1+}\) as a function of \(x\)

  2. \(PA_{\bar{r}}\) is locally Lipschitz continuous in \(\lambda \) near \(\bar{\lambda }\)

  3. \(\nabla _xPA_{\bar{r}}\) is locally Lipschitz continuous in \(\lambda \) near \(\bar{\lambda }.\)

Finally, if \(f_i+\frac{\bar{r}}{2}q\) is convex then \(PA_{\bar{r}}(\cdot ,e_i)=f_i(\cdot ).\)

Proof

Let \(F(x,\lambda )=-\sum _{i=1}^m\lambda _ie_{\bar{r}}f_i(x).\) By Proposition 1, \(PA_{\bar{r}}\) is well-defined and finite-valued. Since \(P_{\bar{r}}f_i\) is single-valued for all \(i, P_{\bar{r}}F\) is single-valued as well. Since \(f_i\) is proper, lsc, and prox-bounded for all \(i,\) and \(\bar{r}\) is greater than each threshold \(r_i,\) Proposition 2 gives us that \(F\) is continuously para-prox-regular at \((\bar{x},\bar{\lambda })\) for \(\bar{v}\in \partial _xF(\bar{x},\bar{\lambda }),\) and that \((0,y)\in \partial ^\infty F(\bar{x},\bar{\lambda })\Rightarrow y=0.\) Since \(P_{\bar{r}}f_i\) is single-valued and Lipschitz continuous for all \(i,\) we have all the conditions of [11, Prop 4.5], and therefore

  1. \((0,\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(0)\Rightarrow \lambda ^{\prime }=0\)

  2. \((x^{\prime },\lambda ^{\prime })\in D^*(\partial _xF)(\bar{x},\bar{\lambda }|\bar{v})(v^{\prime }), v^{\prime }\ne 0\Rightarrow \langle x^{\prime },v^{\prime }\rangle >-\rho |v^{\prime }|^2\) for some \(\rho >0\)

  3. The mapping \(\partial _xF(\bar{x},\cdot )\) has a continuous selection \(g\) near \(\bar{\lambda }.\)

Hence the condition \(\bar{r}>\max \{\rho ,\rho ^{\prime },r\}\) of Theorem 1 is satisfied (recall \(r=\max _i\{r_i\}\)). Therefore, all conditions of Theorem 1 hold, and we may assume its result. Since \(\delta \in \mathcal C ^2,\) there exists \(\bar{K}>0\) such that

$$\begin{aligned} |\delta (\lambda ^{\prime })-\delta (\lambda )|\le \bar{K}|\lambda ^{\prime }-\lambda | \end{aligned}$$

for all \(\lambda ^{\prime },\lambda \) near \(\bar{\lambda }.\) The rest of the proof is the same as that of [11, Thm 4.6].\(\square \)

Corollary 1

For \(i\in \{1,2,\ldots ,m\}\), let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper and lsc such that for some \(r>0, f_i+\frac{r}{2}q\) is convex for all \(i.\) Then \(f_i\) is prox-regular and prox-bounded, and inequality (4.1) holds. In particular, all the conditions of Theorem 2 hold.

Proof

Since \(f_i+\frac{r}{2}q\) is convex for all \(i,\) we have that \(f_i\) is prox-bounded and lower-\(\mathcal C ^2\), and therefore prox-regular, for all \(i.\) Since

$$\begin{aligned} P_1 \left( f_i+\frac{r}{2}q\right) =P_{r+1}f_i, \end{aligned}$$

by [16, Prop 12.19] we have that \(I-P_1(f_i+\frac{r}{2}q)\) is Lipschitz continuous with constant at most 1. Thus

$$\begin{aligned} \mathrm{Lip}\,\left\{ \sum \limits _{i=1}^m\lambda _iP_{r+1}f_i-I\right\} =\mathrm{Lip}\,\left\{ \sum \limits _{i=1}^m\lambda _i(I-P_{r+1}f_i)\right\} \le \sum \limits _{i=1}^m\lambda _i=1. \end{aligned}$$

This provides inequality (4.1).\(\square \)

5 Example

In 2010, Goebel et al. presented a study of the minimizers of the proximal average function for convex functions. For convex functions \(f_i\), recall that \(-e_r \left( -\sum _{i=1}^m \lambda _i e_r f_i\right) (x)\) defines the proximal average from [2]. It was shown that

$$\begin{aligned} \varPhi (\lambda )\, {:=}\, \mathop {\mathrm{argmin}}\limits _x -e_r \left( -\sum _{i=1}^m \lambda _i e_r f_i\right) (x) \end{aligned}$$

is single-valued and continuous, provided that all functions are bounded below and at least one function is essentially strictly convex [8, Thm 3.8]. We next show that if \(f_i\) are convex functions, then the minimizers of the NC-proximal average coincide exactly with the minimizers of the proximal average. In particular, in this case all results from [8] hold.

Lemma 2

For \(i\in \{1,2,\ldots ,m\}\) let \(f_i:\mathbb R ^n\rightarrow \mathbb R \cup \{\infty \}\) be proper, lsc, convex, and bounded below, and let \(\lambda \in \varLambda \). Then

$$\begin{aligned} \mathop {\mathrm{argmin}}\limits _x PA_r(x,\lambda ) = \mathop {\mathrm{argmin}}\limits _x \sum _{i=1}^m \lambda _i e_r f_i(x) = \mathop {\mathrm{argmin}}\limits _x -e_r \left( -\sum _{i=1}^m \lambda _i e_r f_i\right) (x). \end{aligned}$$

Proof

The minimizers of \(PA_r(\cdot ,\lambda )\) coincide with the minimizers of its Moreau envelope \(e_{r+\delta (\lambda )} PA_r(\cdot ,\lambda )\). By [16, Ex 11.26(d)], we have that \(-e_{r+\delta (\lambda )} PA_r(x,\lambda ) = \left( \sum _{i=1}^m -\lambda _i e_r f_i(x)\right) \), so the first equality holds. The second equality appears in [8, Lem 3.2]. \(\square \)

If the \(f_i\) are nonconvex, then the proximal average is undefined, and the results from [8] no longer apply. In this case, the results of Theorem 2 provide some insight into the continuity of the minimizers of the NC-proximal average, as follows.

Corollary 2

Let the conditions of Theorem 2 hold. Let \(x_k\in \mathop {\mathrm{argmin}}\nolimits _xPA_r(x,\lambda _k).\) Suppose \(\lambda _k\rightarrow \bar{\lambda }\) and \(x_k\rightarrow \bar{x}.\) Then \(\nabla PA_r(\bar{x},\bar{\lambda })=0.\)

Proof

By Theorem 2, \(\nabla PA_r\) is Lipschitz continuous in \(\lambda .\) Therefore, there exists \(c>0\) such that for all \(k,\)

$$\begin{aligned} |\nabla PA_r(x_k,\lambda _k)-\nabla PA_r(x_k,\bar{\lambda })|\le c|\lambda _k-\bar{\lambda }|. \end{aligned}$$

Since \(x_k\in \mathop {\mathrm{argmin}}\nolimits _xPA_r(x,\lambda _k),\) we know that \(\nabla PA_r(x_k,\lambda _k)=0.\) So for all \(k,\)

$$\begin{aligned} |\nabla PA_r(x_k,\bar{\lambda })|\le c|\lambda _k-\bar{\lambda }|. \end{aligned}$$

Taking the limit as \(k\rightarrow \infty ,\) we find that \(\nabla PA_r(\bar{x},\bar{\lambda })=0\).\(\square \)

While Corollary 2 gives us a way to identify the minimizers of \(PA_r,\) it says nothing about the single-valuedness or the continuity of said minimizers. The example that follows illustrates that, in fact, the function of minimizers of the NC-proximal average may be multi-valued and discontinuous.

Let \(\epsilon =\frac{1}{2},\) and define the functions \(g_0\) and \(g_1\) via

$$\begin{aligned}&g_0(x)\, {:=}\, \max \left\{ -x,-\frac{1}{2}(x-1)^2+\frac{1}{2},x-2+\epsilon \right\} ,\\&g_1(x)\, {:=}\, \max \left\{ -x+\epsilon ,-\frac{1}{2}(x-1)^2+\frac{1}{2},x-2\right\} . \end{aligned}$$

Then \(g_0\) and \(g_1\) are proper, lsc, and bounded below (see Fig. 1).

Fig. 1 Functions \(g_0\) and \(g_1\) for \(\epsilon =0.5\)

Moreover, \(g_i+\frac{1}{2}q\) is convex for \(i\in \{0,1\}\). Let \(k=2-\sqrt{4-2\epsilon }, l=\sqrt{4-2\epsilon }\) and define

$$\begin{aligned} \begin{array}{llll} \delta _0 \,{:=}\, 0 &{}\quad \delta _1\,{:=}\,\epsilon &{}\quad \epsilon _0\,{:=}\,\epsilon &{}\quad \epsilon _1\,{:=}\, 0\\ k_0\,{:=}\, 0 &{}\quad k_1\,{:=}\, k &{}\quad l_0\,{:=}\, l &{}\quad l_1\,{:=}\, 2. \end{array} \end{aligned}$$

Consider \(P_rg_i(\bar{x})=\mathop {\mathrm{argmin}}\nolimits _x\{g_i(x)+\frac{r}{2}|x-\bar{x}|^2\}.\) If \(r > 1\), then we find that

$$\begin{aligned} P_r g_i(\bar{x})= {\left\{ \begin{array}{ll} \bar{x}+\frac{1}{r}, &{} \bar{x}<k_i-\frac{1}{r}\\ k_i, &{} \bar{x}\in [k_i-\frac{1}{r},k_i-\frac{k_i}{r}+\frac{1}{r}]\\ \frac{r\bar{x}-1}{r-1}, &{} \bar{x}\in (k_i-\frac{k_i}{r}+\frac{1}{r},l_i-\frac{l_i}{r}+\frac{1}{r})\\ l_i, &{} \bar{x}\in [l_i-\frac{l_i}{r}+\frac{1}{r},l_i+\frac{1}{r}]\\ \bar{x}-\frac{1}{r}, &{} \bar{x}>l_i+\frac{1}{r}. \end{array}\right. } \end{aligned}$$

Evaluating the Moreau envelope and simplifying, we get

$$\begin{aligned} e_rg_i(\bar{x})= {\left\{ \begin{array}{ll} -\bar{x}-\frac{1}{2r}+\delta _i, &{} \bar{x}<k_i-\frac{1}{r}\\ \frac{r}{2}\bar{x}^2-rk_i\bar{x}+\frac{r-1}{2}k_i^2+k_i, &{} \bar{x}\in [k_i-\frac{1}{r},k_i-\frac{k_i}{r}+\frac{1}{r}]\\ -\frac{1}{2(r-1)}(r\bar{x}^2-2r\bar{x}+1), &{} \bar{x}\in (k_i-\frac{k_i}{r}+\frac{1}{r},l_i-\frac{l_i}{r}+\frac{1}{r})\\ \frac{r}{2}\bar{x}^2-rl_i\bar{x}+\frac{r-1}{2}l_i^2+l_i, &{} \bar{x}\in [l_i-\frac{l_i}{r}+\frac{1}{r},l_i+\frac{1}{r}]\\ \bar{x}-2-\frac{1}{2r}+\epsilon _i, &{} \bar{x}>l_i+\frac{1}{r}. \end{array}\right. } \end{aligned}$$
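
The two piecewise formulas above can be cross-checked against a direct minimization. The following sketch transcribes \(g_i\), \(P_rg_i\), and \(e_rg_i\) for \(\epsilon =\frac{1}{2}\) and compares them with a brute-force search over a grid; the grid, the sample points, the value \(r=2\), and the helper names are assumptions made for illustration.

```python
import math

EPS = 0.5
K = 2.0 - math.sqrt(4.0 - 2.0 * EPS)
L = math.sqrt(4.0 - 2.0 * EPS)
# (delta_i, epsilon_i, k_i, l_i) for i = 0, 1, following the constants defined above.
PARAMS = [(0.0, EPS, 0.0, L), (EPS, 0.0, K, 2.0)]


def g(i, x):
    """The example functions g_0 and g_1."""
    parabola = -0.5 * (x - 1.0) ** 2 + 0.5
    if i == 0:
        return max(-x, parabola, x - 2.0 + EPS)
    return max(-x + EPS, parabola, x - 2.0)


def prox_g(i, xb, r):
    """Piecewise formula for P_r g_i (requires r > 1)."""
    _, _, k, l = PARAMS[i]
    if xb < k - 1.0 / r:
        return xb + 1.0 / r
    if xb <= k - k / r + 1.0 / r:
        return k
    if xb < l - l / r + 1.0 / r:
        return (r * xb - 1.0) / (r - 1.0)
    if xb <= l + 1.0 / r:
        return l
    return xb - 1.0 / r


def env_g(i, xb, r):
    """Piecewise formula for e_r g_i (requires r > 1)."""
    d, e, k, l = PARAMS[i]
    if xb < k - 1.0 / r:
        return -xb - 0.5 / r + d
    if xb <= k - k / r + 1.0 / r:
        return 0.5 * r * xb ** 2 - r * k * xb + 0.5 * (r - 1.0) * k ** 2 + k
    if xb < l - l / r + 1.0 / r:
        return -(r * xb ** 2 - 2.0 * r * xb + 1.0) / (2.0 * (r - 1.0))
    if xb <= l + 1.0 / r:
        return 0.5 * r * xb ** 2 - r * l * xb + 0.5 * (r - 1.0) * l ** 2 + l
    return xb - 2.0 - 0.5 / r + e


def brute_force(i, xb, r, grid):
    """Return (approximate envelope value, approximate proximal point) by grid search."""
    return min((g(i, y) + 0.5 * r * (y - xb) ** 2, y) for y in grid)


grid = [j / 1000.0 for j in range(-3000, 5001)]  # illustrative grid on [-3, 5]
r = 2.0
points = [-1.0, -0.1, 0.3, 0.8, 1.2, 1.6, 2.1, 2.4, 3.0]
env_gap = 0.0
prox_gap = 0.0
for i in (0, 1):
    for xb in points:
        value, y = brute_force(i, xb, r, grid)
        env_gap = max(env_gap, abs(value - env_g(i, xb, r)))
        prox_gap = max(prox_gap, abs(y - prox_g(i, xb, r)))
```

Here `env_gap` and `prox_gap` hold the largest deviations between the closed forms and the grid search; both should be on the order of the grid spacing, since the search only sees grid points.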

Considering the specific example \(r=2\), and applying \(\epsilon =\frac{1}{2}\), we define the function \(G(\bar{x},\lambda )\,{:=}\, (\lambda e_2 g_0+(1-\lambda )e_2 g_1)(\bar{x}),\) which can be expanded to

$$\begin{aligned} G(\bar{x},\lambda )= {\left\{ \begin{array}{ll} -\bar{x}-\frac{\lambda }{2}+\frac{1}{4}, &{}\qquad \bar{x}<-\frac{1}{2}\\ \lambda \bar{x}^2+(\lambda -1)\bar{x}-\frac{\lambda -1}{4}, &{}\qquad \bar{x}\in [-\frac{1}{2},\frac{3-2\sqrt{3}}{2})\\ \bar{x}^2+(\lambda -1)(4-2\sqrt{3})\bar{x}-\frac{(\lambda -1)(11-6\sqrt{3})}{2}, &{}\qquad \bar{x}\in [\frac{3-2\sqrt{3}}{2},\frac{1}{2}]\\ (1-2\lambda )\bar{x}^2+[-4+2\sqrt{3}+(6-2\sqrt{3})\lambda ]\bar{x} &{}\\ \quad +\frac{11-6\sqrt{3}}{2}-(6-3\sqrt{3})\lambda , &{}\qquad \bar{x}\in (\frac{1}{2},\frac{3-\sqrt{3}}{2}]\\ -\bar{x}^2+2\bar{x}-\frac{1}{2}, &{}\qquad \bar{x}\in (\frac{3-\sqrt{3}}{2},\frac{1+\sqrt{3}}{2})\\ (2\lambda -1)\bar{x}^2+[2-(2+2\sqrt{3})\lambda ]\bar{x}-\frac{1}{2}+(2+\sqrt{3})\lambda , &{}\qquad \bar{x}\in [\frac{1+\sqrt{3}}{2},\frac{3}{2})\\ \bar{x}^2-[4-(4-2\sqrt{3})\lambda ]\bar{x}+4-\frac{5-2\sqrt{3}}{2}\lambda , &{}\qquad \bar{x}\in [\frac{3}{2},\frac{1+2\sqrt{3}}{2}]\\ (1-\lambda )\bar{x}^2+(5\lambda -4)\bar{x}+4-\frac{23}{4}\lambda , &{}\qquad \bar{x}\in (\frac{1+2\sqrt{3}}{2},\frac{5}{2}]\\ \bar{x}+\frac{\lambda }{2}-\frac{9}{4}, &{}\qquad \bar{x}>\frac{5}{2}. \end{array}\right. } \end{aligned}$$

By Lemma 2, we know that

$$\begin{aligned} \mathop {\mathrm{argmin}}\limits _{\bar{x}}PA_r(\bar{x},\lambda ) =\mathop {\mathrm{argmin}}\limits _{\bar{x}}G(\bar{x},\lambda ). \end{aligned}$$

Figure 2 displays graphs of \(G\) for various values of \(\lambda .\)

Fig. 2 \(G(\bar{x},\lambda )\)

Noting that \(G\in \mathcal C ^1,\) we find three critical points (where \(\frac{\partial }{\partial x}G(x, \lambda )=0\)):

  1. \(\bar{x}_1=(1-\lambda )(2-\sqrt{3})\) (leftmost local minimum argument),

  2. \(\bar{x}_2=1\) (local maximum argument),

  3. \(\bar{x}_3=2-(2-\sqrt{3})\lambda \) (rightmost local minimum argument).

Observe that when \(\lambda =\frac{1}{2}\) we have that \(\bar{x}_1=\frac{2-\sqrt{3}}{2}\), \(\bar{x}_3=\frac{2+\sqrt{3}}{2},\) and

$$\begin{aligned} G \left( \frac{2-\sqrt{3}}{2},\frac{1}{2}\right) =\frac{2-\sqrt{3}}{2} =G \left( \frac{2+\sqrt{3}}{2},\frac{1}{2}\right) . \end{aligned}$$

This verifies that there are two minimizers when \(\lambda =\frac{1}{2}.\) Finally, we note that

$$\begin{aligned} G(\bar{x}_1,\lambda )>G(\bar{x}_3,\lambda ),~\lambda \in \bigg [0,\frac{1}{2}\bigg ) \quad \text{ and }\quad G(\bar{x}_1,\lambda )<G(\bar{x}_3,\lambda ),~\lambda \in \bigg (\frac{1}{2},1\bigg ], \end{aligned}$$

which proves the argmin is a singleton whenever \(\lambda \ne \frac{1}{2}.\) Therefore, \(\mathrm{argmin}PA_r\) is not a continuous function of \(\lambda .\)
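
The behaviour of the minimizers in this example can be reproduced numerically. The sketch below is an illustration under stated assumptions: it fixes \(r=2\) and \(\epsilon =\frac{1}{2}\), builds \(G\) from the piecewise expression for \(e_rg_i\) given earlier, evaluates \(G\) at the two local minimizers \(\bar{x}_1\) and \(\bar{x}_3\) for \(\lambda =\frac{1}{2}\), and locates the grid minimizer for one value of \(\lambda \) on each side of \(\frac{1}{2}\). The grid, the sample values \(0.4\) and \(0.6\), and the helper names are hypothetical choices.

```python
import math

R = 2.0
EPS = 0.5
ROOT = math.sqrt(4.0 - 2.0 * EPS)  # equals sqrt(3) for EPS = 1/2
# (delta_i, epsilon_i, k_i, l_i) for i = 0, 1.
PARAMS = [(0.0, EPS, 0.0, ROOT), (EPS, 0.0, 2.0 - ROOT, 2.0)]


def env_g(i, xb):
    """Piecewise formula for e_R g_i with R = 2."""
    d, e, k, l = PARAMS[i]
    if xb < k - 1.0 / R:
        return -xb - 0.5 / R + d
    if xb <= k - k / R + 1.0 / R:
        return 0.5 * R * xb ** 2 - R * k * xb + 0.5 * (R - 1.0) * k ** 2 + k
    if xb < l - l / R + 1.0 / R:
        return -(R * xb ** 2 - 2.0 * R * xb + 1.0) / (2.0 * (R - 1.0))
    if xb <= l + 1.0 / R:
        return 0.5 * R * xb ** 2 - R * l * xb + 0.5 * (R - 1.0) * l ** 2 + l
    return xb - 2.0 - 0.5 / R + e


def G(xb, lam):
    """G(x, lam) = lam * e_2 g_0(x) + (1 - lam) * e_2 g_1(x)."""
    return lam * env_g(0, xb) + (1.0 - lam) * env_g(1, xb)


def x_left(lam):
    """Leftmost local minimizer of G(., lam)."""
    return (1.0 - lam) * (2.0 - ROOT)


def x_right(lam):
    """Rightmost local minimizer of G(., lam)."""
    return 2.0 - (2.0 - ROOT) * lam


def grid_argmin(lam, grid):
    """Grid point with the smallest value of G(., lam)."""
    return min(grid, key=lambda xb: G(xb, lam))


grid = [j / 1000.0 for j in range(-1000, 3001)]  # illustrative grid on [-1, 3]

# At lam = 1/2 the two local minima share the same value.
tie_left = G(x_left(0.5), 0.5)
tie_right = G(x_right(0.5), 0.5)

# On either side of 1/2 the global minimizer sits on a different branch.
argmin_below = grid_argmin(0.4, grid)
argmin_above = grid_argmin(0.6, grid)
jump = abs(argmin_below - argmin_above)
```

In this sketch `tie_left` and `tie_right` agree with the common value \(\frac{2-\sqrt{3}}{2}\) up to rounding, and `jump` is the distance between the grid minimizers for the two sampled values of \(\lambda \), which stays bounded away from zero even though the two values of \(\lambda \) are close.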

6 Conclusion

We have seen that, using the Moreau envelope definition, the NC-proximal average can be generalized to accommodate any finite number of suitable functions. Under appropriate conditions, \(PA_r\) is well-defined, lower-\(\mathcal C ^2,\) and locally Lipschitz continuous in \(x\) and in \(\lambda .\) These properties make \(PA_r\) a useful function for researchers in the field of optimization.