Abstract
In discrete convex analysis, the scaling and proximity properties for the class of L\(^{\natural }\)-convex functions were established more than a decade ago and have been used to design efficient minimization algorithms. For the larger class of integrally convex functions of n variables, we show here that the scaling property holds only when \(n \le 2\), while a proximity theorem can be established for every n, albeit with a superexponential bound. This is, however, sufficient to extend the classical logarithmic complexity result for minimizing a discrete convex function of one variable to the case of integrally convex functions of any fixed number of variables.
1 Introduction
The proximity-scaling approach is a fundamental technique in designing efficient algorithms for discrete or combinatorial optimization. For a function \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) in integer variables and a positive integer \(\alpha \), called a scaling unit, the \(\alpha \)-scaling of f means the function \(f^{\alpha }\) defined by \(f^{\alpha }(x) = f(\alpha x) \)\((x \in {\mathbb {Z}}^{n})\). A proximity theorem is a result guaranteeing that a (local) minimum of the scaled function \(f^{\alpha }\) is close to a minimizer of the original function f. The scaled function \(f^{\alpha }\) is simpler and hence easier to minimize, whereas the quality of the obtained minimizer of \(f^{\alpha }\) as an approximation to the minimizer of f is guaranteed by a proximity theorem. The proximity-scaling approach consists in applying this idea for a decreasing sequence of \(\alpha \), often by halving the scaling unit \(\alpha \). A generic form of a proximity-scaling algorithm may be described as follows, where \(K_{\infty }\ (> 0)\) denotes the \(\ell _{\infty }\)-size of the effective domain \(\mathrm{dom\,}f = \{ x \in {{\mathbb {Z}}}^{n} \mid f(x) < +\,\infty \}\) and \(B(n,\alpha )\) denotes the proximity bound in \(\ell _{\infty }\)-distance for \(f^{\alpha }\).
The algorithm consists of \(O(\log _{2} K_{\infty })\) scaling phases. This approach has been particularly successful for resource allocation problems [8,9,10, 16] and for convex network flow problems (under the name of “capacity scaling”) [1, 14, 15]. Different types of proximity theorems have also been investigated: proximity between integral and real optimal solutions [9, 31, 32], among others. For other types of algorithms of nonlinear integer optimization, see, e.g., [5].
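In code, the generic proximity-scaling scheme looks roughly as follows. This is a minimal Python sketch, not the paper's pseudocode verbatim: the brute-force \(\alpha \)-local search over all \(3^{n}\) directions and the function names are illustrative assumptions, and f is assumed to return \(+\infty \) outside its effective domain.

```python
from itertools import product

def alpha_local_minimize(f, x, alpha):
    """Descend to an alpha-local minimizer: a point x with
    f(x) <= f(x + alpha*d) for every d in {-1, 0, +1}^n (brute force)."""
    n = len(x)
    while True:
        best = x
        for d in product((-1, 0, 1), repeat=n):
            y = tuple(x[i] + alpha * d[i] for i in range(n))
            if f(y) < f(best):
                best = y
        if best == x:
            return x
        x = best

def proximity_scaling_minimize(f, x0, K_inf):
    """Generic proximity-scaling scheme: start with a large scaling
    unit and halve it, re-optimizing locally in each phase."""
    alpha = 1
    while alpha < K_inf:          # about log2(K_inf) phases
        alpha *= 2
    x = x0
    while alpha >= 1:
        x = alpha_local_minimize(f, x, alpha)
        alpha //= 2
    return x

# illustrative run on a separable convex function
x_min = proximity_scaling_minimize(
    lambda x: (x[0] - 3) ** 2 + abs(x[1] + 5), (0, 0), 16)
```

For a discrete convex class equipped with a proximity theorem, the \(\alpha \)-local search in each phase can be confined to a box of radius \(B(n,\alpha )\) around the current point; that restriction is what makes the \(O(\log _{2} K_{\infty })\)-phase scheme efficient, and the unrestricted search above ignores it for simplicity.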
In discrete convex analysis [22,23,24,25], a variety of discrete convex functions are considered. A separable convex function is a function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) that can be represented as \(f(x) = \varphi _{1}(x_{1}) + \cdots + \varphi _{n}(x_{n})\), where \(x=(x_{1}, \ldots ,x_{n})\), with univariate discrete convex functions \(\varphi _{i}: {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) satisfying \(\varphi _{i}(t-1) + \varphi _{i}(t+1) \ge 2 \varphi _{i}(t)\) for all \(t \in {{\mathbb {Z}}}\).
A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called integrally convex if its local convex extension \({\tilde{f}}: {{\mathbb {R}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is (globally) convex in the ordinary sense, where \({\tilde{f}}\) is defined as the collection of convex extensions of f in each unit hypercube \(\{ x \in {{\mathbb {R}}}^{n} \mid a_{i} \le x_{i} \le a_{i} + 1 \ (i=1,\ldots , n) \}\) with \(a \in {{\mathbb {Z}}}^{n}\); see Sect. 2 for precise statements.
A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called L\(^{\natural }\)-convex if it satisfies one of the equivalent conditions in Theorem 1.1 below. For \(x,y \in {{\mathbb {Z}}}^{n}\), \(x \vee y\) and \(x \wedge y\) denote the vectors of componentwise maximum and minimum of x and y, respectively. Discrete midpoint convexity of f for \(x,y \in {{\mathbb {Z}}}^{n}\) means
$$\begin{aligned} f(x) + f(y) \ge f \left( \left\lceil \frac{x+y}{2} \right\rceil \right) + f \left( \left\lfloor \frac{x+y}{2} \right\rfloor \right) , \end{aligned}$$(1.1)
where \(\lceil \cdot \rceil \) and \(\lfloor \cdot \rfloor \) denote the integer vectors obtained by componentwise rounding-up and rounding-down to the nearest integers, respectively. We use the notation \(\varvec{1}=(1,1,\ldots , 1)\) and \(\varvec{1}_{i}\) for the i-th unit vector \((0,\ldots ,0, \overset{\overset{i}{\vee }}{1},0,\ldots ,0)\), with the convention \(\varvec{1}_{0}=\varvec{0}\).
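As a concrete illustration of (1.1): the quadratic \(f(x) = (x_{1}-x_{2})^{2}\) is L\(^{\natural }\)-convex, so discrete midpoint convexity must hold at every pair of integer points. The following brute-force check is an illustrative sketch (the function and the test range are arbitrary choices, not from the paper):

```python
import math
from itertools import product

def f(x):
    # (x1 - x2)^2 is L-natural-convex, hence discretely midpoint convex
    return (x[0] - x[1]) ** 2

def midpoint_convex_at(f, x, y):
    """Discrete midpoint convexity (1.1) at the pair (x, y)."""
    up = tuple(math.ceil((a + b) / 2) for a, b in zip(x, y))
    dn = tuple(math.floor((a + b) / 2) for a, b in zip(x, y))
    return f(x) + f(y) >= f(up) + f(dn)

dmc_ok = all(midpoint_convex_at(f, x, y)
             for x in product(range(-3, 4), repeat=2)
             for y in product(range(-3, 4), repeat=2))
```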
Theorem 1.1
([2, 4, 23]) For \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) the following conditions, (a) to (d), are equivalent:
-
(a)
f is integrally convex and submodular:
$$\begin{aligned} f(x) + f(y) \ge f(x \vee y) + f(x \wedge y) \qquad (x, y \in {{\mathbb {Z}}}^{n}). \end{aligned}$$(1.2)
-
(b)
f satisfies discrete midpoint convexity (1.1) for all \( x, y \in {{\mathbb {Z}}}^{n}\).
-
(c)
f satisfies discrete midpoint convexity (1.1) for all \( x, y \in {{\mathbb {Z}}}^{n}\) with \(\Vert x-y \Vert _{\infty } \le 2\), and the effective domain has the property: \(x, y \in \mathrm{dom\,}f \Rightarrow \left\lceil (x+y)/2 \right\rceil , \left\lfloor (x+y)/2 \right\rfloor \in \mathrm{dom\,}f \).
-
(d)
f satisfies translation-submodularity:
$$\begin{aligned} f(x) + f(y)\ge f((x - \mu \mathbf{1}) \vee y) + f(x \wedge (y + \mu \mathbf{1}))\qquad (x, y \in {{\mathbb {Z}}}^{n}, \ 0 \le \mu \in {{\mathbb {Z}}}). \nonumber \\ \end{aligned}$$(1.3)
A simple example to illustrate the difference between integrally convex and L\(^{\natural }\)-convex functions can be provided in the case of quadratic functions. Indeed, for an \(n \times n\) symmetric matrix Q and a vector \(p \in {{\mathbb {R}}}^n\), the function \(f(x) = x^{\top } Q x + p^{\top } x\) is integrally convex whenever Q is diagonally dominant with nonnegative diagonal elements, i.e., \(q_{ii} \ge \sum _{j \not = i} |q_{ij}|\) for \(i=1,\ldots ,n\) [2]. On the other hand, f is L\(^{\natural }\)-convex if and only if it is diagonally dominant with nonnegative diagonal elements and \(q_{ij} \le 0\) for all \(i \ne j\) [23, Section 7.3].
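Both sufficient conditions just stated are mechanical to test; a small sketch (the matrices Q1 and Q2 are illustrative examples, not from the paper):

```python
def diag_dominant_nonneg(Q):
    """q_ii >= sum_{j != i} |q_ij| (which forces q_ii >= 0): sufficient
    for x^T Q x + p^T x to be integrally convex."""
    n = len(Q)
    return all(Q[i][i] >= sum(abs(Q[i][j]) for j in range(n) if j != i)
               for i in range(n))

def is_Lnat_quadratic(Q):
    """Diagonal dominance plus nonpositive off-diagonal entries:
    the characterization of L-natural-convexity for quadratics."""
    n = len(Q)
    return diag_dominant_nonneg(Q) and all(
        Q[i][j] <= 0 for i in range(n) for j in range(n) if i != j)

Q1 = [[2, 1], [1, 2]]    # integrally convex, but not L-natural-convex
Q2 = [[2, -1], [-1, 2]]  # L-natural-convex
```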
A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called M\(^{\natural }\)-convex if it satisfies an exchange property: For any \(x, y \in \mathrm{dom\,}f\) and any \(i \in \mathrm{supp}^{+}(x-y)\), there exists \(j \in \mathrm{supp}^{-}(x-y) \cup \{ 0 \}\) such that
$$\begin{aligned} f(x) + f(y) \ge f(x - \varvec{1}_{i} + \varvec{1}_{j}) + f(y + \varvec{1}_{i} - \varvec{1}_{j}), \end{aligned}$$
where, for \(z \in {{\mathbb {Z}}}^{n}\), \(\mathrm{supp}^{+}(z) = \{ i \mid z_{i} > 0 \}\) and \(\mathrm{supp}^{-}(z) = \{ j \mid z_{j} < 0 \}\). It is known (and easy to see) that a function is separable convex if and only if it is both \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex.
Integrally convex functions constitute a common framework for discrete convex functions, including separable convex, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions as well as \(\mathrm{L}^{\natural }_{2}\)-convex and \(\mathrm{M}^{\natural }_{2}\)-convex functions [23], and BS-convex and UJ-convex functions [3]. The concept of integral convexity is used in formulating discrete fixed point theorems [11, 12, 35], and designing solution algorithms for discrete systems of nonlinear equations [17, 34]. In game theory the integral concavity of payoff functions guarantees the existence of a pure strategy equilibrium in finite symmetric games [13].
The scaling operation preserves \(\mathrm{L}^{\natural }\)-convexity, that is, if f is \(\mathrm{L}^{\natural }\)-convex, then \(f^{\alpha }\) is \(\mathrm{L}^{\natural }\)-convex. \(\mathrm{M}^{\natural }\)-convexity is subtle in this respect: for an \(\mathrm{M}^{\natural }\)-convex function f, \(f^{\alpha }\) remains \(\mathrm{M}^{\natural }\)-convex if \(n \le 2\), while this is not always the case if \(n\ge 3\).
Example 1.1
Here is an example to show that \(\mathrm{M}^{\natural }\)-convexity is not preserved under scaling. Let f be the indicator function of the set \(S = \{ c_{1} (1,0,-1) +c_{2} (1,0,0) + c_{3} (0,1,-1) + c_{4} (0,1,0) \mid c_{i} \in \{ 0,1 \} \ (i=1,2,3,4) \} \subseteq {{\mathbb {Z}}}^{3}\). Then f is an \(\mathrm{M}^{\natural }\)-convex function, but \(f^{2}\) (=\(f^{\alpha }\) with \(\alpha =2\)), being the indicator function of \( \{ (0,0,0), (1,1,-1) \}\), is not \(\mathrm{M}^{\natural }\)-convex. This example is a reformulation of [23, Note 6.18] for M-convex functions to \(\mathrm{M}^{\natural }\)-convex functions.
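The claim of Example 1.1 can be confirmed by direct enumeration; a sketch:

```python
from itertools import product

gens = [(1, 0, -1), (1, 0, 0), (0, 1, -1), (0, 1, 0)]

# S: all 0/1 combinations of the four generators
S = {tuple(sum(c[k] * gens[k][i] for k in range(4)) for i in range(3))
     for c in product((0, 1), repeat=4)}

# effective domain of f^2: points x with 2x in S
S2 = {tuple(v // 2 for v in x) for x in S if all(v % 2 == 0 for v in x)}
```

The resulting two-point set fails the M\(^{\natural }\)-exchange property for \(x = (0,0,0)\) and \(y = (1,1,-1)\): no exchange step along a coordinate pair keeps both adjusted points in the set.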
It is rather surprising that nothing is known about scaling for integrally convex functions. Example 1.1 does not demonstrate the lack of scaling property of integrally convex functions, since \(f^{2}\) above is integrally convex, though not \(\mathrm{M}^{\natural }\)-convex.
As for proximity theorems, the following facts are known for separable convex, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions. In the following three theorems we assume that \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\), \(\alpha \) is a positive integer, and \(x^{\alpha } \in \mathrm{dom\,}f\). It is noteworthy that the proximity bound is \(\alpha - 1\), independent of n, for separable convex functions, whereas it is \(n (\alpha -1)\), linear in n, for \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions.
Theorem 1.2
Suppose that f is a separable convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ \varvec{1}_{i}, -\varvec{1}_{i} \ (1 \le i \le n) \} \), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le \alpha -1\).
Proof
The statement is obviously true if \(n=1\). Then the statement for general n follows easily from the fact that \(x^{*}\) is a minimizer of \(f(x) = \varphi _{1}(x_{1}) + \cdots + \varphi _{n}(x_{n})\) if and only if, for each i, \(x^{*}_{i}\) is a minimizer of \(\varphi _{i}\). \(\square \)
Theorem 1.3
([15]; [23, Theorem 7.18]) Suppose that f is an \(\mathrm{L}^{\natural }\)-convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ 0, 1 \}^{n} \cup \{ 0, -1 \}^{n}\), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le n (\alpha -1)\).
Theorem 1.4
([18]; [23, Theorem 6.37]) Suppose that f is an \(\mathrm{M}^{\natural }\)-convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ \varvec{1}_{i}, -\varvec{1}_{i} \ (1 \le i \le n), \ \varvec{1}_{i} - \varvec{1}_{j} \ (i \not = j) \} \), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le n (\alpha -1)\).
Based on the above results and their variants, efficient algorithms for minimizing \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions have been successfully designed with the proximity-scaling approach [18, 20, 21, 23, 30, 33]. Proximity theorems are also available for \(\mathrm{L}^{\natural }_{2}\)-convex and \(\mathrm{M}^{\natural }_{2}\)-convex functions [27] and L-convex functions on graphs [6, 7]. However, no proximity theorem has yet been proved for integrally convex functions.
The new findings of this paper are
-
A “box-barrier property” (Theorem 2.6), which allows us to restrict the search for a global minimum of an integrally convex function;
-
Stability of integral convexity under scaling when \(n = 2\) (Theorem 3.2), and an example to demonstrate its failure when \(n \ge 3\) (Example 3.1);
-
A proximity theorem with a superexponential bound \(\displaystyle [(n+1)!/ 2^{n-1}](\alpha -1)\) for all n (Theorem 5.1), and the impossibility of finding a proximity bound of the form \(B(n)(\alpha -1)\) where B(n) is linear or smaller than quadratic (Examples 4.4 and 4.5).
As a consequence of our proximity and scaling results, we derive that:
-
When n is fixed, an integrally convex function can be minimized in \(O(\log _{2} K_{\infty })\) time by standard proximity-scaling algorithms, where \(K_{\infty } = \max \{ \Vert x -y \Vert _{\infty } \mid x, y \in \mathrm{dom\,}f \}\) denotes the \(\ell _{\infty }\)-size of \(\mathrm{dom\,}f\).
This paper is organized as follows. In Sect. 2 the concept of integrally convex functions is reviewed with some new observations and, in Sect. 3, their scaling property is clarified. After a preliminary discussion in Sect. 4, a proximity theorem for integrally convex functions is established in Sect. 5. Algorithmic implications of the proximity-scaling results are discussed in Sect. 6 and concluding remarks are made in Sect. 7.
2 Integrally convex sets and functions
For \(x \in {{\mathbb {R}}}^{n}\) the integer neighborhood of x is defined as
$$\begin{aligned} N(x) = \{ z \in {{\mathbb {Z}}}^{n} \mid |x_{i} - z_{i}| < 1 \ \ (i=1,\ldots ,n) \}. \end{aligned}$$
For a function \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) the local convex extension \({\tilde{f}}: {{\mathbb {R}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) of f is defined as the union of all convex envelopes of f on N(x) as follows:
$$\begin{aligned} {\tilde{f}}(x) = \min \Big \{ \sum _{y \in N(x)} \lambda _{y} f(y) \ \Big | \ \sum _{y \in N(x)} \lambda _{y} y = x, \ (\lambda _{y}) \in \Lambda (x) \Big \} \qquad (x \in {{\mathbb {R}}}^{n}), \end{aligned}$$
where \(\Lambda (x)\) denotes the set of coefficients for convex combinations indexed by N(x):
$$\begin{aligned} \Lambda (x) = \Big \{ (\lambda _{y} \mid y \in N(x)) \ \Big | \ \sum _{y \in N(x)} \lambda _{y} = 1, \ \lambda _{y} \ge 0 \ \ (y \in N(x)) \Big \}. \end{aligned}$$
If \({{\tilde{f}}}\) is convex on \({{\mathbb {R}}}^{n}\), then f is said to be integrally convex [2]. A set \(S \subseteq {{\mathbb {Z}}}^{n}\) is said to be integrally convex if the convex hull \({\overline{S}}\) of S coincides with the union of the convex hulls of \(S \cap N(x)\) over \(x \in {{\mathbb {R}}}^{n}\), i.e., if, for any \(x \in {{\mathbb {R}}}^{n}\), \(x \in {\overline{S}} \) implies \(x \in \overline{S \cap N(x)}\). A set \(S \subseteq {{\mathbb {Z}}}^{n}\) is integrally convex if and only if its indicator function is an integrally convex function. The effective domain and the set of minimizers of an integrally convex function are both integrally convex [23, Proposition 3.28]; in particular, the effective domain and the set of minimizers of an L\(^{\natural }\)- or M\(^{\natural }\)-convex function are integrally convex.
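Computationally, the integer neighborhood N(x) consists of the integer points of the unit box(es) containing x: each coordinate is rounded down and up, with a single choice at integral coordinates. A sketch:

```python
import math
from itertools import product

def integer_neighborhood(x):
    """N(x) = { z in Z^n : |x_i - z_i| < 1 for all i }."""
    choices = []
    for xi in x:
        lo, hi = math.floor(xi), math.ceil(xi)
        choices.append((lo,) if lo == hi else (lo, hi))
    return set(product(*choices))
```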
For \(n=2\), integrally convex sets are illustrated in Fig. 1 and their structure is described in the next proposition.
Proposition 2.1
A set \(S \subseteq {\mathbb {Z}}^{2}\) is an integrally convex set if and only if it can be represented as \(S = \{ (x_{1},x_{2}) \in {\mathbb {Z}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) for some \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m)\).
Proof
Consider the convex hull \({\overline{S}}\) of S, and denote the (shifted) unit square \(\{ (x_{1}, x_{2}) \in {{\mathbb {R}}}^{2} \mid a_{i} \le x_{i} \le a_{i}+1 \ (i=1,2) \}\) by \(I(a_{1}, a_{2})\), where \((a_{1}, a_{2}) \in {{\mathbb {Z}}}^{2}\). Let S be an integrally convex set. It follows from the definition that \({\overline{S}} \cap I(a_{1}, a_{2}) = \overline{S \cap I(a_{1}, a_{2})}\) for each \((a_{1}, a_{2}) \in {{\mathbb {Z}}}^{2}\). Obviously, \(\overline{S \cap I(a_{1}, a_{2})}\) can be described by (at most four) inequalities \(p'_{j} x_{1} + q'_{j} x_{2} \le r'_{j} \ (j=1,\ldots ,\ell ')\) with \(p'_{j}, q'_{j} \in \{ -1,0,+1 \}\) and \(r'_{j} \in {{\mathbb {Z}}}\)\((j=1,\ldots ,\ell ')\), where \(\ell ' = \ell '(a_{1}, a_{2}) \le 4\). Since \({\overline{S}}\) is the union of sets \({\overline{S}} \cap I(a_{1}, a_{2})\), \({\overline{S}}\) can be represented as \(\{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) by a subfamily of the inequalities used for all \(\overline{S \cap I(a_{1}, a_{2})}\). Then we have \(S = \{ (x_{1},x_{2}) \in {\mathbb {Z}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\). Conversely, integral convexity of set S represented in this form for any \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\) is an easy consequence of the simple shape of the (possibly unbounded) polygon \( \{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\), which has at most eight edges having directions parallel to one of the vectors (1, 0), (0, 1), (1, 1), \((1,-1)\). \(\square \)
We note that in the special case where all inequalities \(p_{i} x_{1} + q_{i} x_{2} \le r_{i}\)\((i=1,\ldots ,m)\) defining S in Proposition 2.1 satisfy the additional property \(p_{i} q_{i} \le 0\), the set S is actually an \(\mathrm{L}^{\natural }\)-convex set [23, Section 5.5], which is a special type of sublattice [28].
Remark 2.1
A subtle point in Proposition 2.1 is explained here. In Proposition 2.1 we do not mean that the system of inequalities for S describes the convex hull \({\overline{S}}\) of S. That is, it is not claimed that \({\overline{S}} = \{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) holds. For instance, \(S = \{ (0,0), (1,0) \}\) is an integrally convex set, which can be represented as the set of integer points satisfying the four inequalities: \( -x_{1} + x_{2} \le 0\), \( x_{1} - x_{2} \le 1\), \( x_{1} + x_{2} \le 1\), and \( -x_{1} - x_{2} \le 0\). These inequalities, however, do not describe the convex hull \({\overline{S}}\), which is the line segment connecting (0, 0) and (1, 0). Nevertheless, it is true in general (cf. the proof of Proposition 2.1) that the convex hull of an integrally convex set can be described by inequalities of the form of \(p'_{i} x_{1} + q'_{i} x_{2} \le r'_{i}\) with \(p'_{i}, q'_{i} \in \{ -1,0,+1 \}\) and \(r'_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m')\). For \(S = \{ (0,0), (1,0) \}\) we can describe \({\overline{S}}\) by adding two inequalities \(x_{2} \le 0\) and \(- \,x_{2} \le 0\) to the original system of four inequalities. The present form of Proposition 2.1, avoiding the convex hull, is convenient in the proof of Proposition 3.1.
Corollary 2.2
If a set \(S \subseteq {{\mathbb {Z}}}^2\) is integrally convex, then for all points \(x,y \in S\), the set
$$\begin{aligned} \mathrm{ICH}(x,y) = \{ z \in {{\mathbb {Z}}}^{2} \mid \ {}&\min (x_{i}, y_{i}) \le z_{i} \le \max (x_{i}, y_{i}) \ \ (i=1,2), \\&\min (x_{1}-x_{2}, \, y_{1}-y_{2}) \le z_{1}-z_{2} \le \max (x_{1}-x_{2}, \, y_{1}-y_{2}), \\&\min (x_{1}+x_{2}, \, y_{1}+y_{2}) \le z_{1}+z_{2} \le \max (x_{1}+x_{2}, \, y_{1}+y_{2}) \} \end{aligned}$$
is contained in S.
Proof
Let S be represented as in Proposition 2.1 and let \(x,y \in S\). Then we clearly have \( \max \{p_{i} x_{1} + q_{i} x_{2}, p_{i} y_{1} + q_{i} y_{2} \} \le r_{i}\)\((i=1,\ldots ,m)\). The claim follows by observing that \(\max \{p_{i} x_{1} + q_{i} x_{2}, p_{i} y_{1} + q_{i} y_{2} \}\) coincides with one of \(\max \{x_{i}, y_{i} \}\), \(\max \{-\,x_{i}, -\,y_{i} \}\)\((i=1,2)\), \(\max \{x_{1}-x_{2}, y_{1}- y_{2} \}\), \(\max \{x_{1}+x_{2}, y_{1}+ y_{2} \}\), \(\max \{-\,x_{1}+x_{2}, -\,y_{1}+ y_{2} \}\), \(\max \{-\,x_{1}-x_{2}, -\,y_{1}-y_{2} \}\), according to the values of \(p_{i}, q_{i} \in \{ -1,0,+1 \}\). \(\square \)
Note that \(\mathrm{ICH}(x,y)\) is integrally convex by Proposition 2.1, and that, by the above corollary, any integrally convex set containing \(\{x,y \}\) must contain \(\mathrm{ICH}(x,y)\). Thus \(\mathrm{ICH}(x,y)\) is the smallest integrally convex set containing \(\{x,y \}\).
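The set \(\mathrm{ICH}(x,y)\) can be enumerated directly: by the proof of Corollary 2.2 it consists of the integer points z whose values of \(z_{1}\), \(z_{2}\), \(z_{1}-z_{2}\) and \(z_{1}+z_{2}\) all lie between the corresponding values at x and y. A brute-force sketch (illustrative code, not from the paper):

```python
from itertools import product

def ICH(x, y):
    """Smallest integrally convex subset of Z^2 containing {x, y}."""
    forms = [lambda z: z[0], lambda z: z[1],
             lambda z: z[0] - z[1], lambda z: z[0] + z[1]]
    box = product(range(min(x[0], y[0]), max(x[0], y[0]) + 1),
                  range(min(x[1], y[1]), max(x[1], y[1]) + 1))
    return {z for z in box
            if all(min(g(x), g(y)) <= g(z) <= max(g(x), g(y))
                   for g in forms)}
```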
Integral convexity is preserved under the operations of origin shift, permutation of components, and componentwise (individual) sign inversion. For later reference we state these facts as a proposition.
Proposition 2.3
Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function.
-
(1)
For any \(z \in {{\mathbb {Z}}}^{n}\), \(f(z + x)\) is integrally convex in x.
-
(2)
For any permutation \(\sigma \) of \((1,2,\ldots ,n)\), \(f(x_{\sigma (1)}, x_{\sigma (2)}, \ldots , x_{\sigma (n)})\) is integrally convex in x.
-
(3)
For any \(s_{1}, s_{2}, \ldots , s_{n} \in \{ +1, -1 \}\), \(f(s_{1} x_{1}, s_{2} x_{2}, \ldots , s_{n} x_{n})\) is integrally convex in x.
Proof
The claims (1) to (3) follow easily from the definition of integrally convex functions and the obvious relations: \(N(z + x) = \{ z + y \mid y \in N(x) \}\), \( N((x_{\sigma (1)}, \ldots , x_{\sigma (n)})) = \{ (y_{\sigma (1)}, \ldots , y_{\sigma (n)}) \mid y \in N(x) \}\), and \(N((s_{1} x_{1}, \ldots , s_{n} x_{n})) = \{ (s_{1} y_{1}, \ldots , s_{n} y_{n}) \mid y \in N(x) \}\). \(\square \)
Integral convexity of a function can be characterized by a local condition under the assumption that the effective domain is an integrally convex set. The following theorem is proved in [2] when the effective domain is an integer interval (discrete rectangle). An alternative proof, which is also valid for the general case, is given in “Appendix A”.
Theorem 2.4
([2, Proposition 3.3]) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be a function with an integrally convex effective domain. Then the following properties are equivalent:
-
(a)
f is integrally convex.
-
(b)
For every \(x, y \in \mathrm{dom\,}f\) with \(\Vert x - y \Vert _{\infty } =2\) we have
$$\begin{aligned} {\tilde{f}}\, \bigg (\frac{x + y}{2} \bigg ) \le \frac{1}{2} (f(x) + f(y)). \end{aligned}$$(2.2)
Theorem 2.5
([2, Proposition 3.1]; see also [23, Theorem 3.21]) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function and \(x^{*} \in \mathrm{dom\,}f\). Then \(x^{*}\) is a minimizer of f if and only if \(f(x^{*}) \le f(x^{*} + d)\) for all \(d \in \{ -1, 0, +1 \}^{n}\).
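By Theorem 2.5, for an integrally convex function the \(3^{n}-1\) neighbors in \(\{ -1,0,+1 \}^{n}\) certify global optimality. A sketch of the local test, applied to an illustrative L\(^{\natural }\)-convex (hence integrally convex) quadratic chosen for this demonstration:

```python
from itertools import product

def is_local_min(f, x):
    """Theorem 2.5 test: f(x) <= f(x + d) for all d in {-1, 0, +1}^n."""
    n = len(x)
    return all(f(x) <= f(tuple(x[i] + d[i] for i in range(n)))
               for d in product((-1, 0, 1), repeat=n))

def f(x):
    # L-natural-convex, hence integrally convex
    return (x[0] - x[1]) ** 2 + (x[0] - 2) ** 2
```

The point (2, 2) passes the local test and is indeed a global minimizer, while (0, 0) fails it.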
The local characterization of global minima stated in Theorem 2.5 above can be generalized to the following form; see Fig. 2.
Theorem 2.6
(Box-barrier property) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, and let \(p \in ({\mathbb {Z}} \cup \{ -\,\infty \})^{n}\) and \(q \in ({\mathbb {Z}} \cup \{ +\,\infty \})^{n}\), where \(p \le q\). Define
$$\begin{aligned} S&= \{ x \in {{\mathbb {Z}}}^{n} \mid p_{i}< x_{i} < q_{i} \ \ (i=1,\ldots ,n) \}, \\ W_{i}^{+}&= \{ x \in {{\mathbb {Z}}}^{n} \mid x_{i} = q_{i}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \} \qquad (i=1,\ldots ,n), \\ W_{i}^{-}&= \{ x \in {{\mathbb {Z}}}^{n} \mid x_{i} = p_{i}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \} \qquad (i=1,\ldots ,n), \end{aligned}$$
and \(W = \bigcup _{i=1}^{n} (W_{i}^{+} \cup W_{i}^{-})\). Let \({{\hat{x}}} \in S \cap \mathrm{dom\,}f\). If \(f({{\hat{x}}}) \le f(y)\) for all \(y \in W\), then \(f({{\hat{x}}}) \le f(z)\) for all \(z \in {{\mathbb {Z}}}^{n} {\setminus } S\).
Proof
Let \(U = \bigcup _{i=1}^{n} \{ x \in {\mathbb {R}}^{n} \mid x_{i} \in \{ p_{i}, q_{i} \}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \}\), for which we have \(U \cap {\mathbb {Z}}^{n} = W\). For a point \(z \in {{\mathbb {Z}}}^{n} {\setminus } S\), the line segment connecting \({{\hat{x}}}\) and z intersects U at a point, say, \(u \in {{\mathbb {R}}}^{n}\). Then its integral neighborhood N(u) is contained in W. Since the local convex extension \({\tilde{f}}(u)\) is a convex combination of the f(y)’s with \(y \in N(u)\), and \(f(y) \ge f({{\hat{x}}})\) for every \(y \in W\), we have \({\tilde{f}}(u) \ge f({{\hat{x}}})\). On the other hand, it follows from integral convexity that \({\tilde{f}}(u) \le (1 - \lambda ) f({{\hat{x}}}) + \lambda f(z)\) for some \(\lambda \) with \(0 < \lambda \le 1\). Hence \(f({{\hat{x}}}) \le {\tilde{f}}(u) \le (1 - \lambda ) f({{\hat{x}}}) + \lambda f(z)\), and therefore, \(f({{\hat{x}}}) \le f(z)\). \(\square \)
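The box-barrier property can be exercised numerically: if a point inside the open box beats every integer point of the "wall" W, Theorem 2.6 guarantees that it beats every point outside the box. A sketch on \({{\mathbb {Z}}}^{2}\); the function, the box, and the grid used for the spot-check are assumptions chosen for illustration:

```python
from itertools import product

def f(x):
    # (x1 - x2)^2 + x1^2 is L-natural-convex, hence integrally convex
    return (x[0] - x[1]) ** 2 + x[0] ** 2

p, q = (-2, -2), (2, 2)
x_hat = (0, 0)

# W: integer points of the box with some coordinate equal to p_i or q_i
box = list(product(range(p[0], q[0] + 1), range(p[1], q[1] + 1)))
W = [z for z in box if any(z[i] in (p[i], q[i]) for i in range(2))]

barrier_holds = all(f(x_hat) <= f(y) for y in W)

# conclusion of Theorem 2.6, spot-checked on a larger grid
outside_ok = all(f(x_hat) <= f(z)
                 for z in product(range(-6, 7), repeat=2)
                 if not all(p[i] < z[i] < q[i] for i in range(2)))
```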
Theorem 2.5 is a special case of Theorem 2.6 with \(p={{\hat{x}}} - \varvec{1}\) and \(q={{\hat{x}}} + \varvec{1}\). Another special case of Theorem 2.6 with \(p_{j} = -\,\infty \)\((j=1,\ldots ,n)\) and \(q_{j} = +\,\infty \)\((j \not = i)\) for a particular i takes the following form, which we use in Sect. 5.3.
Corollary 2.7
(Hyperplane-barrier property). Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function. Let \({{\hat{x}}} \in \mathrm{dom\,}f\), \(q \in {{\mathbb {Z}}}\), and let i be an integer with \(1 \le i \le n\). If \({{\hat{x}}}_{i} < q\) and \(f({{\hat{x}}}) \le f(y)\) for all \(y \in {{\mathbb {Z}}}^{n}\) with \(y_{i} = q\), then \(f({{\hat{x}}}) \le f(z)\) for all \(z \in {{\mathbb {Z}}}^{n}\) with \(z_{i} \ge q\).
We denote the sets of nonnegative integers and positive integers by \({{\mathbb {Z}}}_{+}\) and \({{\mathbb {Z}}}_{++}\), respectively. For \(\alpha \in {{\mathbb {Z}}}\) we write \(\alpha {{\mathbb {Z}}}\) for \(\{ \alpha x \mid x \in {{\mathbb {Z}}}\}\). For vectors \(a, b \in {{\mathbb {R}}}^{n}\) with \(a \le b\), \([a,b]_{{{\mathbb {R}}}}\) denotes the interval between a and b, i.e., \([a,b]_{{{\mathbb {R}}}} = \{ x \in {{\mathbb {R}}}^{n} \mid a \le x \le b \}\), and \([a,b]_{{{\mathbb {Z}}}}\) the integer interval between a and b, i.e., \([a,b]_{{{\mathbb {Z}}}} = \{ x \in {{\mathbb {Z}}}^{n} \mid a \le x \le b \}\).
3 The scaling operation for integrally convex functions
In this section we consider the scaling operation for integrally convex functions. Recall that, for \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) and \(\alpha \in {\mathbb {Z}}_{++}\), the \(\alpha \)-scaling of f is defined to be the function \(f^{\alpha }: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) given by \(f^{\alpha }(x) = f(\alpha x) \)\((x \in {\mathbb {Z}}^{n})\).
When \(n = 2\), integral convexity is preserved under scaling. We first deal with integrally convex sets.
Proposition 3.1
Let \(S \subseteq {\mathbb {Z}}^{2}\) be an integrally convex set and \(\alpha \in {\mathbb {Z}}_{++}\). Then \(S^{\alpha } = \{ x \in {\mathbb {Z}}^{2} \mid \alpha x \in S \}\) is an integrally convex set.
Proof
By Proposition 2.1 we can assume that S is represented as \(S = \{ (x_1,x_2) \in {{\mathbb {Z}}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) for some \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m)\). Since \((y_1, y_2) \in S^{\alpha }\) if and only if \((\alpha y_1, \alpha y_2) \in S\), and \(p_{i} (\alpha y_{1}) + q_{i} (\alpha y_{2}) \le r_{i}\) is equivalent to \(p_{i} y_{1} + q_{i} y_{2} \le \lfloor r_{i} / \alpha \rfloor \) for integers \(y_{1}, y_{2}\), we have
$$\begin{aligned} S^{\alpha } = \{ (y_{1},y_{2}) \in {{\mathbb {Z}}}^{2} \mid p_{i} y_{1} + q_{i} y_{2} \le r'_{i} \ (i=1,\ldots ,m) \}, \end{aligned}$$
where \(r'_{i} = \lfloor r_{i} / \alpha \rfloor \)\((i=1,\ldots ,m)\). By Proposition 2.1 this implies integral convexity of \(S^{\alpha }\). \(\square \)
Next we turn to integrally convex functions.
Theorem 3.2
Let \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function and \(\alpha \in {\mathbb {Z}}_{++}\). Then the scaled function \(f^{\alpha }\) is integrally convex.
Proof
The effective domain \(\mathrm{dom\,}f^{\alpha } = ( \mathrm{dom\,}f \cap (\alpha {{\mathbb {Z}}})^{2} )/\alpha \) is an integrally convex set by Proposition 3.1. By Theorem 2.4 and Proposition 2.3, we only have to check condition (2.2) for \(f^{\alpha }\) with \(x =(0,0)\) and \(y = (2,0)\), (2, 2), (2, 1). That is, it suffices to show
$$\begin{aligned} f(\alpha , 0) \le \frac{1}{2} \left( f(0,0) + f(2\alpha , 0) \right) , \end{aligned}$$(3.1)
$$\begin{aligned} f(\alpha , \alpha ) \le \frac{1}{2} \left( f(0,0) + f(2\alpha , 2\alpha ) \right) , \end{aligned}$$(3.2)
$$\begin{aligned} f(\alpha , 0) + f(\alpha , \alpha ) \le f(0,0) + f(2\alpha , \alpha ). \end{aligned}$$(3.3)
The first two inequalities, (3.1) and (3.2), follow easily from integral convexity of f, whereas (3.3) is a special case of the basic parallelogram inequality (3.4) below with \(a = b = \alpha \). \(\square \)
Proposition 3.3
(Basic parallelogram inequality). For an integrally convex function \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) we have
$$\begin{aligned} f(a,a) + f(b,0) \le f(0,0) + f(a+b, a) \qquad (a, b \in {{\mathbb {Z}}}_{+}). \end{aligned}$$(3.4)
Proof
We may assume \(a, b \ge 1\) and \(\{ (0,0), (a+b,a) \} \subseteq \mathrm{dom\,}f\), since otherwise the inequality (3.4) is trivially true. Since \(\mathrm{dom\,}f\) is integrally convex, Corollary 2.2 implies that \(k(1,1) +l(1,0) \in \mathrm{dom\,}f\) for all (k, l) with \(0 \le k \le a\) and \(0 \le l\le b\). We use the notation \(f_{x}(z)=f(x+z)\). For each \(x \in \mathrm{dom\,}f\) we have
$$\begin{aligned} f_{x}(1,1) + f_{x}(1,0) \le f_{x}(0,0) + f_{x}(2,1) \end{aligned}$$
by integral convexity of f. By adding these inequalities for \(x=k(1,1) +l(1,0)\) with \(0 \le k \le a-1\) and \(0 \le l \le b-1\), we obtain (3.4). Note that all the terms involved in these inequalities are finite, since \(k(1,1) +l(1,0) \in \mathrm{dom\,}f\) for all k and l. \(\square \)
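A numerical spot-check of the basic parallelogram inequality \(f(a,a) + f(b,0) \le f(0,0) + f(a+b,a)\): the function below is L\(^{\natural }\)-convex, hence integrally convex (an illustrative choice, not from the paper):

```python
def f(x):
    # (x1 - x2)^2 + x1^2 is L-natural-convex, hence integrally convex
    return (x[0] - x[1]) ** 2 + x[0] ** 2

parallelogram_ok = all(
    f((a, a)) + f((b, 0)) <= f((0, 0)) + f((a + b, a))
    for a in range(8) for b in range(8))
```

For this f the two sides differ by \(2ab \ge 0\), so the inequality holds with equality exactly when \(ab = 0\).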
If \(n \ge 3\), \(f^{\alpha }\) is not always integrally convex. This is demonstrated by the following example.
Example 3.1
Consider the integrally convex function \(f: {\mathbb {Z}}^{3} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) defined on \(\mathrm{dom\,}f = [(0,0,0), (4,2,2)]_{{{\mathbb {Z}}}}\) by
For the scaling with \(\alpha = 2\), we have a failure of integral convexity. Indeed, for \(x = (0,0,0)\) and \(y = (2, 1, 1)\) we have
which shows the failure of (2.2) in Theorem 2.4. The set \(S = \arg \min f = \{ x \mid f(x)=0 \}\) is an integrally convex set, and \(S^{\alpha } = \{ x \mid \alpha x \in S \} = \{ (0,0,0), (1,0,0), (1,0,1), (2,1,1) \}\) is not an integrally convex set.
In view of the fact that the class of \(\mathrm{L}^{\natural }\)-convex functions is stable under scaling, while this is not true for the superclass of integrally convex functions, we are naturally led to the question of finding an intermediate class of functions that is stable under scaling. See Sect. 7 for this issue.
4 Preliminary discussion on proximity theorems
Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) and \(\alpha \in {{\mathbb {Z}}}_{++}\). We say that \(x^{\alpha } \in \mathrm{dom\,}f\) is an \(\alpha \)-local minimizer of f (or \(\alpha \)-local minimal for f) if \(f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d)\) for all \(d \in \{ -1,0, +1 \}^{n}\). In general terms a proximity theorem states that for \(\alpha \in {{\mathbb {Z}}}_{++}\) there exists an integer \(B(n,\alpha ) \in {{\mathbb {Z}}}_{+}\) such that if \(x^{\alpha }\) is an \(\alpha \)-local minimizer of f, then there exists a minimizer \(x^{*}\) of f satisfying \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le B(n,\alpha )\), where \(B(n,\alpha )\) is called the proximity distance.
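The \(\alpha \)-local minimality test is a direct scan over the \(3^{n}\) scaled directions; a sketch (an illustrative helper, not from the paper):

```python
from itertools import product

def is_alpha_local_min(f, x, alpha):
    """f(x) <= f(x + alpha*d) for all d in {-1, 0, +1}^n."""
    n = len(x)
    return all(f(x) <= f(tuple(x[i] + alpha * d[i] for i in range(n)))
               for d in product((-1, 0, 1), repeat=n))
```

For instance, with \(g(t) = |t - 3|\) and \(\alpha = 8\), the origin passes the test even though the true minimizer is \(t = 3\); the distance 3 is within the bound \(\alpha - 1\) of Theorem 1.2.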
Before presenting a proximity theorem for integrally convex functions in Sect. 5, we establish in this section lower bounds for the proximity distance. We also present a proximity theorem for \(n=2\), as the proof is fairly simple in this particular case, though the proof method does not extend to general \(n \ge 3\).
4.1 Lower bounds for the proximity distance
The following examples provide us with lower bounds for the proximity distance. The first three demonstrate the tightness of the bounds for separable convex functions, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions given in Theorems 1.2, 1.3 and 1.4, respectively.
Example 4.1
(Separable convex function) Let \(\varphi (t) = \max (-t, (\alpha -1)(t - \alpha ) )\) for \(t \in {{\mathbb {Z}}}\) and define \(f(x) = \varphi (x_{1}) + \cdots + \varphi (x_{n})\), which is separable convex. This function has a unique minimizer at \(x^{*}=(\alpha -1, \ldots , \alpha -1)\), whereas \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal and \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = \alpha - 1\). This shows the tightness of the bound \(\alpha - 1\) given in Theorem 1.2.
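Example 4.1 is easily verified by enumeration for small parameters; the sketch below takes \(n = 2\) and \(\alpha = 3\) (illustrative values, and a finite grid standing in for \({{\mathbb {Z}}}^{n}\)):

```python
from itertools import product

alpha, n = 3, 2

def phi(t):
    return max(-t, (alpha - 1) * (t - alpha))

def f(x):
    return sum(phi(t) for t in x)

x_star = (alpha - 1,) * n   # claimed unique minimizer
zero = (0,) * n             # claimed alpha-local minimizer

alpha_local = all(
    f(zero) <= f(tuple(alpha * di for di in d))
    for d in product((-1, 0, 1), repeat=n))
unique_min = all(f(z) > f(x_star)
                 for z in product(range(-5, 10), repeat=n)
                 if z != x_star)
gap = max(abs(zero[i] - x_star[i]) for i in range(n))  # equals alpha - 1
```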
Example 4.2
(\(\mathrm{L}^{\natural }\)-convex function) Consider \(X \subseteq {{\mathbb {Z}}}^{n}\) defined by
$$\begin{aligned} X = \Big \{ \sum _{i=1}^{n} \mu _{i} \varvec{1}_{ \{ 1,2,\ldots , i \} } \ \Big | \ \mu _{i} \in [0, \alpha -1]_{{{\mathbb {Z}}}} \ \ (i=1,\ldots ,n) \Big \}, \end{aligned}$$
where \( \varvec{1}_{ \{ 1,2,\ldots , i \} } = ( \overbrace{1,1,\ldots , 1}^{i}, 0,0,\ldots , 0)\). The function f defined by \(f(x)=-x_{1}\) on \(\mathrm{dom\,}f =X\) is an \(\mathrm{L}^{\natural }\)-convex function and has a unique minimizer at \(x^{*} = (n (\alpha -1),(n-1)(\alpha -1),\ldots ,2(\alpha -1),\alpha -1)\). On the other hand, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(X \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\). We have \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = n (\alpha - 1)\), which shows the tightness of the bound \(n (\alpha - 1)\) given in Theorem 1.3. This example is a reformulation of [26, Remark 2.3] for L-convex functions to \(\mathrm{L}^{\natural }\)-convex functions.
Example 4.3
(\(\mathrm{M}^{\natural }\)-convex function) Consider \(X \subseteq {{\mathbb {Z}}}^{n}\) defined by
$$\begin{aligned} X = \Big \{ \mu _{1} \varvec{1}_{1} + \sum _{i=2}^{n} \mu _{i} (\varvec{1}_{1} - \varvec{1}_{i}) \ \Big | \ \mu _{i} \in [0, \alpha -1]_{{{\mathbb {Z}}}} \ \ (i=1,\ldots ,n) \Big \}. \end{aligned}$$
The function f defined by \(f(x)=-x_{1}\) on \(\mathrm{dom\,}f =X\) is an \(\mathrm{M}^{\natural }\)-convex function and has a unique minimizer at \(x^{*} = (n (\alpha -1),-\,(\alpha -1), -\,(\alpha -1),\ldots ,-\,(\alpha -1))\). On the other hand, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(X \cap \{ -\,\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\). We have \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = n (\alpha - 1)\), which shows the tightness of the bound \(n (\alpha - 1)\) given in Theorem 1.4. This example is a reformulation of [26, Remark 2.8] for M-convex functions to \(\mathrm{M}^{\natural }\)-convex functions.
For integrally convex functions with \(n \ge 3\), the bound \(n (\alpha -1)\) is no longer valid. This is demonstrated by the following examples.
Example 4.4
Consider an integrally convex function \(f: {\mathbb {Z}}^{3} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) defined on \(\mathrm{dom\,}f = [(0,0,0), (4,2,2)]_{{{\mathbb {Z}}}}\) by
and let \(\alpha = 2\). For \(x^{\alpha }=(0,0,0)\) we have \(f(x^{\alpha })=0\) and \(f(x^{\alpha }) \le f(x^{\alpha }+ 2 d)\) for \(d=(1,0,0), (0,1,0)\), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1). Hence \(x^{\alpha }=(0,0,0)\) is \(\alpha \)-local minimal. The unique (global) minimizer of f is located at \(x^{*}=(4,2,2)\) with \(f(x^{*})=-\,4\) and \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } = 4\). The \(\ell _{\infty }\)-distance between \(x^{\alpha }\) and \(x^{*}\) is strictly larger than \(n (\alpha -1) = 3\). We remark that the scaled function \(f^{\alpha }\) is not integrally convex.
The following example demonstrates a quadratic lower bound in n for the proximity distance for integrally convex functions.
Example 4.5
For a positive integer \(m \ge 1\), we consider two bipartite graphs \(G_{1}\) and \(G_{2}\) on vertex bipartition \((\{ 0^{+}, 1^{+},\ldots ,m^{+} \}, \{ 0^{-}, 1^{-},\ldots ,m^{-} \})\); see Fig. 3. The edge sets of \(G_{1}\) and \(G_{2}\) are defined respectively as \(E_{1} = \{ (0^{+}, 0^{-}) \} \cup \{ (i^{+}, j^{-}) \mid i,j=1,\ldots ,m \}\) and \(E_{2} = \{ (0^{+}, j^{-}) \mid j=1,\ldots ,m \} \cup \{ (i^{+}, 0^{-}) \mid i=1,\ldots ,m \}\). Let \(V^{+} = \{ 1^{+},\ldots ,m^{+} \}\), \(V^{-} = \{ 1^{-},\ldots ,m^{-} \}\), and \(n = 2m + 2\). Consider \(X_{1}, X_{2} \subseteq {{\mathbb {Z}}}^{n}\) defined by
where \(X_{1}\) and \(X_{2}\) represent the sets of boundaries of flows in \(G_{1}\) and \(G_{2}\), respectively. We define functions \(f_{1},f_{2} : {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) with \(\mathrm{dom\,}f_{1} = X_{1}\) and \(\mathrm{dom\,}f_{2} = X_{2}\) by
where \(x(U) = \sum _{u \in U} x_{u}\) for any set U of vertices. Both \(f_{1}\) and \(f_{2}\) are M-convex, and hence \(f = f_{1} + f_{2}\) is an \(\mathrm{M}_{2}\)-convex function, which is integrally convex (see [23, Section 8.3.1]). We have \(\mathrm{dom\,}f = \mathrm{dom\,}f_{1} \cap \mathrm{dom\,}f_{2} = X_{1} \cap X_{2}\) and f is linear on \(\mathrm{dom\,}f\). As is easily verified, f has a unique minimizer at \(x^{*}\) defined by
which corresponds to \(\lambda _{0} = m^{2}(\alpha -1)\), \(\lambda _{ij} = \alpha -1\), \(\mu _{i} = \nu _{j} = m(\alpha -1)\) \((i,j=1,\ldots ,m)\). We mention that the function f here is constructed in [26, Remark 2.19] for a slightly different purpose (namely, for an \(\mathrm{M}_{2}\)-proximity theorem).
Let \(x^{\alpha } = \mathbf{0}\). Obviously, \(\mathbf{0}\in \mathrm{dom\,}f\). Moreover, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(\mathrm{dom\,}f \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\), as shown below. Since \(\Vert x^{*}-x^{\alpha }\Vert _{\infty } = m^{2}(\alpha -1) = (n-2)^{2}(\alpha -1)/4\), we obtain a quadratic lower bound \((n-2)^{2}(\alpha -1)/4\) for the proximity distance for integrally convex functions.
The proof of \(\mathrm{dom\,}f \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\) goes as follows. Let \(x \in X_{1} \cap X_{2} \cap \{ -\alpha , 0,\alpha \}^{n}\). We have \(x_{0^{+}} \in \{ 0, \alpha \}\) and \(x_{0^{-}} \in \{ 0, -\alpha \}\). We consider four cases to conclude that \(x = \varvec{0}\).
(i) Case of \(x_{0^{+}} = x_{0^{-}} = 0\): The structure of \(X_{2}\) forces \(x = \varvec{0}\).
(ii) Case of \(x_{0^{+}} = \alpha \), \(x_{0^{-}} = 0\): The structure of \(X_{2}\) forces \(x_{i^{+}} = 0\) for \(i=1, \ldots , m\) and
$$\begin{aligned} x_{j^{-}} = \left\{ \begin{array}{ll} -\,\alpha &{} (j = j_{0}), \\ 0 &{} (j \not = j_{0}) \\ \end{array}\right. \end{aligned}$$ for some \(j_{0}\) \((1 \le j_{0} \le m)\), but this is impossible by the structure of \(X_{1}\).
(iii) Case of \(x_{0^{+}} = 0\), \(x_{0^{-}} = -\,\alpha \): The proof is similar to that of (ii) above.
(iv) Case of \(x_{0^{+}} = \alpha \), \(x_{0^{-}} = -\,\alpha \): The structure of \(X_{2}\) forces
$$\begin{aligned} x_{i^{+}} = \left\{ \begin{array}{ll} \alpha &{} (i = i_{0}), \\ 0 &{} (i \not = i_{0}), \\ \end{array}\right. \quad x_{j^{-}} = \left\{ \begin{array}{ll} -\,\alpha &{} (j = j_{0}), \\ 0 &{} (j \not = j_{0}) \\ \end{array}\right. \end{aligned}$$ for some \(i_{0}\) \((1 \le i_{0} \le m)\) and \(j_{0}\) \((1 \le j_{0} \le m)\), but this is impossible by the structure of \(X_{1}\).
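The case analysis above can be confirmed by exhaustive enumeration for small m. The display equations defining \(X_{1}\) and \(X_{2}\) are not reproduced above, so the sketch below assumes a concrete flow model consistent with the case analysis: nonnegative integer flows, with capacity \(\alpha -1\) on each edge \((i^{+},j^{-})\) of \(G_{1}\), capacity \(m^{2}(\alpha -1)\) on \((0^{+},0^{-})\), capacity \(m(\alpha -1)\) on each edge of \(G_{2}\), and coordinates ordered \((0^{+}, 0^{-}, 1^{+},\ldots ,m^{+}, 1^{-},\ldots ,m^{-})\):

```python
from itertools import product

def boundaries_G1(m, a):
    # boundaries of nonnegative integer flows in G1, assuming capacity
    # a-1 on each edge (i+, j-) and m^2*(a-1) on the edge (0+, 0-)
    pts = set()
    for xi00 in range(m * m * (a - 1) + 1):
        for flow in product(range(a), repeat=m * m):  # xi_ij in [0, a-1]
            plus = [sum(flow[i * m + j] for j in range(m)) for i in range(m)]
            minus = [-sum(flow[i * m + j] for i in range(m)) for j in range(m)]
            pts.add((xi00, -xi00, *plus, *minus))
    return pts

def boundaries_G2(m, a):
    # boundaries of nonnegative integer flows in G2, assuming capacity
    # m*(a-1) on each edge (i+, 0-) and (0+, j-)
    cap = m * (a - 1)
    return {(sum(nu), -sum(mu), *mu, *(-v for v in nu))
            for mu in product(range(cap + 1), repeat=m)
            for nu in product(range(cap + 1), repeat=m)}

m, a = 2, 2
n = 2 * m + 2
S = boundaries_G1(m, a) & boundaries_G2(m, a)  # dom f = X1 ∩ X2

# alpha-local minimality of 0:  dom f ∩ {-a, 0, a}^n = {0}
assert {x for x in S if all(v in (-a, 0, a) for v in x)} == {(0,) * n}

# the point x* realizing the quadratic proximity distance m^2*(a-1)
xstar = (m * m * (a - 1), -m * m * (a - 1)) \
        + (m * (a - 1),) * m + (-m * (a - 1),) * m
assert xstar in S
assert max(abs(v) for v in xstar) == (n - 2) ** 2 * (a - 1) // 4
```

Under this assumed model, case (iv) fails exactly as in the proof: row sums \((\alpha ,0,\ldots ,0)\) cannot be matched with column sums \((\alpha ,0,\ldots ,0)\) when every edge capacity is \(\alpha -1\).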
We have seen that the proximity theorem with the linear bound \(n (\alpha -1)\) does not hold for all integrally convex functions. Then a natural question arises: can we establish a proximity theorem at all by enlarging the proximity bound? This question is answered in the affirmative in Sect. 5.
4.2 A proximity theorem for integrally convex functions with \(n=2\)
In the case of \(n=2\), the proximity bound \(n (\alpha - 1) = 2 (\alpha - 1)\) is valid for integrally convex functions (see Note 2).
Theorem 4.1
Let \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, \(\alpha \in {\mathbb {Z}}_{++}\), and \(x^{\alpha } \in \mathrm{dom\,}f\). If \(f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d)\) for all \(d \in \{ -1,0, +1 \}^{2}\), then there exists a minimizer \(x^{*} \in {{\mathbb {Z}}}^{2}\) of f with \( \Vert x^{\alpha } - x^{*} \Vert _{\infty } \le 2 (\alpha - 1)\).
Proof
We may assume \(\alpha \ge 2\) and \(x^{\alpha } = \varvec{0}\) by Proposition 2.3 (1). Define
Let \(\mu \) be the minimum of \(f(x_{1},x_{2})\) over \((x_{1},x_{2}) \in S\) and let \(({{\hat{x}}}_{1},{{\hat{x}}}_{2})\) be a point in S with \(f({{\hat{x}}}_{1},{{\hat{x}}}_{2})=\mu \). Then
We will show that
Then, by Corollary 2.7 (hyperplane-barrier property), it follows that \(f(z_{1},z_{2}) \ge \mu \) for all \((z_{1},z_{2}) \in C\), that is, there is no \((z_{1},z_{2}) \in C {\setminus } S\) with \(f(z_{1},z_{2}) < \mu \). This proves the claim of the theorem, since \({{\mathbb {Z}}}^{2}\) can be covered by eight sectors similar to C and Proposition 2.3 holds.
The basic parallelogram inequality (3.4) with \(a = k\) and \(b =2 \alpha -1-k\) yields
Case 1 \(0 \le k \le \alpha -1\). Since \(2 \alpha -1-k \ge \alpha \), by convexity of f(t, 0) in t, we have
On the other hand, \(f(k,k) \ge \mu \) by (4.1). Then it follows from (4.3) that
Case 2 \(\alpha \le k \le 2 \alpha -1\). Since \(k \ge \alpha \), by convexity of f(t, t) in t, we have
On the other hand, \(f(2 \alpha -1-k,0) \ge \mu \) by (4.1). Then it follows from (4.3) that
We have thus shown (4.2), completing the proof of Theorem 4.1. \(\square \)
5 A proximity theorem for integrally convex functions
In this section we establish a proximity theorem for integrally convex functions in an arbitrary number of variables.
5.1 Main result
Theorem 5.1
Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, \(\alpha \in {\mathbb {Z}}_{++}\), and \(x^{\alpha } \in \mathrm{dom\,}f\).
(1) If
$$\begin{aligned} f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d) \qquad (\forall \ d \in \{ -1,0, +1 \}^{n}), \end{aligned}$$(5.1)
then \(\arg \min f \not = \emptyset \) and there exists \(x^{*} \in \arg \min f\) with
$$\begin{aligned} \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \beta _{n} (\alpha - 1), \end{aligned}$$(5.2)
where \(\beta _{n}\) is defined by
$$\begin{aligned} \beta _{1}=1, \quad \beta _{2}=2; \qquad \beta _{n} = \frac{n+1}{2} \beta _{n-1} + 1 \quad (n=3,4,\ldots ). \end{aligned}$$(5.3)
(2) The coefficient \(\beta _{n}\) of the proximity bound satisfies
$$\begin{aligned} \beta _{n} \le \frac{(n+1)!}{2^{n-1}} \qquad (n=3,4,\ldots ). \end{aligned}$$(5.4)
The numerical values of \(\beta _{n}\) and its bounds are as follows:
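The values can be tabulated from the recurrence (5.3) with exact rational arithmetic; a small Python sketch that also checks the bound (5.4):

```python
from fractions import Fraction
from math import factorial

def beta(n):
    # recurrence (5.3): beta_1 = 1, beta_2 = 2, beta_n = ((n+1)/2) beta_{n-1} + 1
    b = Fraction(1) if n == 1 else Fraction(2)
    for k in range(3, n + 1):
        b = Fraction(k + 1, 2) * b + 1
    return b

for n in range(1, 8):
    print(n, beta(n))
# beta_3 = 5, beta_4 = 27/2, beta_5 = 83/2, ...

# upper bound (5.4): beta_n <= (n+1)! / 2^(n-1) for n >= 3
assert all(beta(n) <= Fraction(factorial(n + 1), 2 ** (n - 1))
           for n in range(3, 12))
```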
Remark 5.1
The bound (5.2) can be strengthened to \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \lfloor \beta _{n} (\alpha - 1) \rfloor \), whereas the stronger form \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \lfloor \beta _{n} \rfloor (\alpha - 1)\) is not justified by our proof.
To prove Theorem 5.1 (1) we first note that the theorem follows from its special case where \(x^{\alpha } = \varvec{0}\) and f is defined on a bounded set in the nonnegative orthant \({{\mathbb {Z}}}_{+}^{n}\). That is, the proof of Theorem 5.1 (1) is reduced to proving the following proposition. We use the notation \(N=\{1,2,\ldots ,n\}\) and \(\varvec{1}_{A}\) for the characteristic vector of \(A \subseteq N\).
Proposition 5.2
Let \(\alpha \in {{\mathbb {Z}}}_{++}\) and \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function such that \(\mathrm{dom\,}f\) is a bounded subset of \({{\mathbb {Z}}}_{+}^{n}\) containing the origin \(\varvec{0}\). If
then there exists \(x^{*} \in \arg \min f\) with
where \(\beta _{n}\) is defined by (5.3).
Suppose that Proposition 5.2 has been established. Then Theorem 5.1 (1) can be derived from Proposition 5.2 in three steps:
1. We may assume \(x^{\alpha } = \varvec{0}\) by Proposition 2.3 (1).
2. We may further assume that \(\mathrm{dom\,}f\) is bounded. Let M be a sufficiently large integer, say, \(M \ge \beta _{n} (\alpha - 1) + 1\), and let \(f_{M}\) be the restriction of f to the integer interval \([ -M \varvec{1}, M \varvec{1} ]_{{{\mathbb {Z}}}}\), where \(\varvec{1}=(1,1,\ldots , 1)\). Then \(x^{\alpha } = \varvec{0}\) is \(\alpha \)-local minimal for \(f_{M}\). If the special case of Theorem 5.1 with \(x^{\alpha } = \varvec{0}\) and bounded \(\mathrm{dom\,}f\) is true, then there exists \(x^{*} \in \arg \min f_{M}\) satisfying \( \Vert x^{*}\Vert _{\infty } \le \beta _{n} (\alpha - 1)\). Since \(x^{*} \in \arg \min f_{M}\) we have \(f_{M}(x^{*}) \le f_{M}(x^{*}+ d)\) \((\forall \, d \in \{ -1,0, +1 \}^{n})\), which implies \(f(x^{*}) \le f(x^{*}+ d)\) \((\forall \, d \in \{ -1,0, +1 \}^{n})\). Then Theorem 2.5 shows that \(x^{*} \in \arg \min f\).
3. We consider \(2^{n}\) orthants separately. For each \(s=(s_{1}, s_{2}, \ldots , s_{n}) \in \{ +1, -1 \}^{n}\) we consider the function \(f_{s}(x)=f(s x)\) on \({{\mathbb {Z}}}_{+}^{n}\), where \(s x = (s_{1} x_{1}, s_{2} x_{2}, \ldots , s_{n} x_{n})\). Noting that \(\mathrm{dom\,}f_{s}\) is a bounded subset of \({{\mathbb {Z}}}_{+}^{n}\), we apply Proposition 5.2 to \(f_{s}\) to obtain \(x^{*}_{s}\) with \( \Vert x^{*}_{s} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\). From among \(2^{n}\) such \(x^{*}_{s}\), take the one with the function value \(f(s x^{*}_{s})\) minimum. Then \(x^{*} = s x^{*}_{s}\) is a minimizer of f, and satisfies \( \Vert x^{*} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\).
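The orthant decomposition in step 3 can be sketched as follows; `solve_nonneg` is a hypothetical oracle for the nonnegative-orthant subproblem (instantiated here by brute force over a small box purely for testing), not something from the original text:

```python
from itertools import product

def minimize_by_orthants(f, n, solve_nonneg):
    # reduce minimization of f over Z^n to 2^n problems on Z_+^n:
    # for each sign vector s, minimize f_s(x) = f(s x) over x >= 0
    best = None
    for s in product((1, -1), repeat=n):
        fs = lambda x, s=s: f(tuple(si * xi for si, xi in zip(s, x)))
        xs = solve_nonneg(fs, n)
        cand = tuple(si * xi for si, xi in zip(s, xs))
        if best is None or f(cand) < f(best):
            best = cand
    return best

# brute-force stand-in for the oracle, over [0, 5]^n
def brute_nonneg(g, n):
    return min(product(range(6), repeat=n), key=g)

f = lambda x: (x[0] - 2) ** 2 + (x[1] + 3) ** 2
assert minimize_by_orthants(f, 2, brute_nonneg) == (2, -3)
```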
5.2 Tools for the proof: f-minimality
In this section we introduce some technical tools that we use in the proof of Proposition 5.2.
For \(A\;(\ne \emptyset ) \subseteq N\), we consider a set of integer vectors
$$\begin{aligned} B_{A} = \{ \varvec{1}_{A} \} \cup \{ \varvec{1}_{A} + \varvec{1}_{i} \mid i \in N \} \cup \{ \varvec{1}_{A} - \varvec{1}_{i} \mid i \in A \} \end{aligned}$$
and the cones of their nonnegative integer and real combinations
$$\begin{aligned} C_{A} = \left\{ \sum _{b \in B_{A}} \lambda _{b}\, b \ \Big | \ \lambda _{b} \in {{\mathbb {Z}}}_{+} \ (b \in B_{A}) \right\} , \qquad {\tilde{C}}_{A} = \left\{ \sum _{b \in B_{A}} \lambda _{b}\, b \ \Big | \ \lambda _{b} \in {{\mathbb {R}}}_{+} \ (b \in B_{A}) \right\} ,\end{aligned}$$
where \(C_{A}\) is often referred to as the integer cone generated by \(B_{A}\). We first note the following fact, which provides us with a clearer geometric view, though it is not used in the proof of Proposition 5.2.
Proposition 5.3
\(B_{A}\) is a Hilbert basis of the convex cone \({\tilde{C}}_{A}\) generated by \(B_{A}\). That is, \(C_{A} = {\tilde{C}}_{A} \cap {\mathbb {Z}}^{n}\).
Proof
The proof is given in “Appendix B”. \(\square \)
For two nonnegative integer vectors \(x, y \in {{\mathbb {Z}}}_{+}^{n}\), we write \(y \preceq _{f} x\) if \(y \le x\) and \(f(y) \le f(x)\). Note that \(y \preceq _{f} x\) if and only if \((y,f(y)) \le (x,f(x))\) in \({{\mathbb {R}}}^{n} \times ({{\mathbb {R}}}\cup \{ +\,\infty \})\). We say that \(x \in {{\mathbb {Z}}}_{+}^{n}\) is f-minimal if \(x \in \mathrm{dom\,}f\) and there exists no \(y \in {{\mathbb {Z}}}_{+}^{n}\) such that \(y \preceq _{f} x\) and \(y \not = x\). That is, x is f-minimal if and only if it is the unique minimizer of the function f restricted to the integer interval \([\varvec{0},x]_{{{\mathbb {Z}}}}\) (see Note 3).
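For intuition, f-minimality can be tested by brute force over the box \([\varvec{0},x]_{{{\mathbb {Z}}}}\); a minimal sketch with an illustrative separable quadratic:

```python
from itertools import product

def is_f_minimal(f, x):
    # x is f-minimal iff no y <= x with y != x and f(y) <= f(x),
    # i.e. x is the unique minimizer of f on the box [0, x]
    fx = f(x)
    return all(y == x or f(y) > fx
               for y in product(*(range(xi + 1) for xi in x)))

f = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2
assert is_f_minimal(f, (1, 2))       # unique minimizer on [0,1] x [0,2]
assert not is_f_minimal(f, (2, 2))   # (1,2) <= (2,2) has a smaller value
```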
The goal of this section is to establish the following connection between f-minimality and the integer cone \(C_{A}\) based at \(\alpha \varvec{1}_{A}\).
Proposition 5.4
Assume \(\alpha \)-local minimality (5.6). If \(y \in {{\mathbb {Z}}}_{+}^{n}\) is f-minimal, then \(y \not \in \alpha \varvec{1}_{A} + C_{A}\) for any \(A (\ne \emptyset ) \subseteq N\).
Our proof of this proposition is based on several lemmas.
Lemma 5.5
Assume \(\alpha \)-local minimality (5.6). For any \(A\;(\ne \emptyset ) \subseteq N\) and \(\lambda \in {{\mathbb {Z}}}_{+}\) we have \((\alpha -1) \varvec{1}_{A} \preceq _{f} (\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A}\).
Proof
First note that \((\alpha -1) \varvec{1}_{A} \le (\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A}\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\). By integral convexity of f, \(g(\lambda ) = f(\lambda \varvec{1}_{A})\) is a discrete convex function in \(\lambda \in {{\mathbb {Z}}}_{+}\), and therefore \(g(\alpha -1) \le \frac{1}{\alpha }\, g(0) + \frac{\alpha -1}{\alpha }\, g(\alpha )\).
On the other hand, \(g(0) \le g(\alpha )\) by the assumed \(\alpha \)-local minimality (5.6). Hence we have \(g(\alpha -1) \le g(\alpha )\). Since \(g(0) < +\,\infty \), by discrete convexity of g, this implies \(g(\alpha -1) \le g((\alpha -1)+\lambda )\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\), i.e., \(f((\alpha -1) \varvec{1}_{A}) \le f((\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A})\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\). \(\square \)
Lemma 5.6
Let \(x \in \mathrm{dom\,}f\), \(A\;(\ne \emptyset ) \subseteq N\), and assume \(x \preceq _{f} x + \varvec{1}_{A}\). Then for any \(i \in N\), \(\delta \in \{ +1, 0, -1 \}\), and \(\lambda \in {{\mathbb {Z}}}_{+}\) we have \(x+\varvec{1}_{A}+ \delta \varvec{1}_{i} \preceq _{f} (x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}\).
Proof
First note that \(x+\varvec{1}_{A}+ \delta \varvec{1}_{i} \le (x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}\). We only need to show \(f(x+\varvec{1}_{A}+ \delta \varvec{1}_{i}) \le f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A})\) when \(f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}) < +\,\infty \). By integral convexity of f we have
whereas \(f(x+\varvec{1}_{A}) \ge f(x)\) by the assumption. Hence \(f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}) \ge f(x+\varvec{1}_{A}+ \delta \varvec{1}_{i})\). \(\square \)
Lemma 5.7
Let \(x \in \mathrm{dom\,}f\), \(A\;(\ne \emptyset ) \subseteq N\), and assume \(x \preceq _{f} x + \varvec{1}_{A}\). For any \(\lambda \in {{\mathbb {Z}}}_{+}\), \(\mu _{i}^{+},\mu _{i}^{-} \in {{\mathbb {Z}}}_{+} \;(i \in A)\), and \(\mu _{i}^{\circ } \in {{\mathbb {Z}}}_{+}\;(i \in N {\setminus } A)\), the point
is not f-minimal.
Proof
By the definition of an f-minimal point, we may assume \(y \in \mathrm{dom\,}f\), since otherwise we are done. Define
which serves, in our proof, as an index to measure the distance between x and y. If \(\mu \le 1\), then y is not f-minimal by Lemma 5.6. Suppose that \(\mu \ge 2\). In the following we construct a vector \(x'\) such that \(x' \in \mathrm{dom\,}f\), \(x' \preceq _{f} x' + \varvec{1}_{A}\), y is represented as (5.11) with \(x'\) in place of x, and the index \(\mu '\) for that representation is strictly smaller than \(\mu \).
Define \(\beta = \mu + \lambda + 1\) and
where we may assume, without loss of generality, that \(A^{+} \cap A^{-} = \emptyset \). Then (5.11) can be rewritten as
which shows
Consider the point
which is not contained in \({{\mathbb {Z}}}^{n}\) since \( z = x + (y-x)/\beta \) and \(\displaystyle 1 \le \max \big ( \max \nolimits _{i \in A^{+}} \mu _{i}^{+}, \max \nolimits _{i \in A^{-}} \mu _{i}^{-}, \max \nolimits _{i \in A^{\circ }} \mu _{i}^{\circ } \big ) \le \beta - 1\) with \(A^{+} \cup A^{-} \cup A^{\circ } \not = \emptyset \). Since f is integrally convex and \(x, y \in \mathrm{dom\,}f\), we have \({\tilde{f}}(z) \le ((\beta -1)/\beta ) f(x) + (1/\beta ) f(y) < +\,\infty \). On the other hand, since
the integral neighborhood N(z) of z consists of all points \(x'\) that can be represented as
for a subset D of \(A^{+} \cup A^{-} \cup A^{\circ }\). Since \({\tilde{f}}(z) < +\,\infty \) and \(z \not \in {{\mathbb {Z}}}^{n}\), we must have \(|N(z) \cap \mathrm{dom\,}f| \ge 2\), which implies that there exists a nonempty D for which \(x' \in \mathrm{dom\,}f\). Take such D that is minimal with respect to set inclusion.
We claim that \(x' \preceq _{f} x' + \varvec{1}_{A}\). Obviously we have \(x' \le x' + \varvec{1}_{A}\). To show \(f(x') \le f(x' + \varvec{1}_{A})\), we may assume \(x' + \varvec{1}_{A} \in \mathrm{dom\,}f\). Then we have the following chain of inequalities:
which shows \(f(x' + \varvec{1}_{A}) \ge f(x')\). Therefore, \(x' \preceq _{f} x' + \varvec{1}_{A}\) is true.
We finally consider the index (5.12) associated with \(x'\), which we denote by \(\mu '\). The substitution of (5.13) into (5.11) yields
This shows \(\mu ' = \mu - |D| \le \mu -1\).
The above procedure finds \(x' \in \mathrm{dom\,}f\) such that \(x' \preceq _{f} x' + \varvec{1}_{A}\) and \(\mu '\le \mu -1\), when given \(x \in \mathrm{dom\,}f\) such that \(x \preceq _{f} x + \varvec{1}_{A}\) and \(\mu \ge 2\). By repeated application of this procedure we can eventually arrive at \(x'' \in \mathrm{dom\,}f\) such that \(x'' \preceq _{f} x'' + \varvec{1}_{A}\) and \(\mu ''\le 1\). Then y is not f-minimal by Lemma 5.6 for \(x''\). \(\square \)
Lemma 5.8
If \((\alpha \varvec{1}_{A} + C_A) \cap (\mathrm{dom\,}f) \ne \emptyset \), then \(\alpha \varvec{1}_{A} \in \mathrm{dom\,}f\).
Proof
To prove by contradiction, take \(y \in (\alpha \varvec{1}_{A} + C_A) \cap (\mathrm{dom\,}f)\) that is minimal with respect to the vector ordering (componentwise ordering) and assume that \(y \ne \alpha \varvec{1}_{A}\). The vector y can be represented as
with some \(\mu _{i}^{+},\mu _{i}^{-} \in {{\mathbb {Z}}}_{+} \;(i \in A)\), \(\mu _{i}^{\circ } \in {{\mathbb {Z}}}_{+}\;(i \in N {\setminus } A)\), and \(\lambda \in {{\mathbb {Z}}}_{+}\), where
is strictly larger than \(\alpha \) since \(y \ne \alpha \varvec{1}_{A}\). Define
where we may assume, without loss of generality, that \(A^{+} \cap A^{-} = \emptyset \) and \(A^{-} \not = A\). We have
Consider the point
Since f is integrally convex and \(y, \mathbf{0}\in \mathrm{dom\,}f\), we have \({\tilde{f}}(z) \le ((\beta -1)/\beta ) f(y) + (1/\beta ) f(\mathbf{0}) < +\,\infty \). Note that
If \(A^{+} \cup A^{-} \cup A^{\circ } = \emptyset \), we are done with a contradiction. Indeed, we then have \(z = (\alpha + \lambda -1 ) \varvec{1}_{A}\) and \(y = (\alpha + \lambda ) \varvec{1}_{A}\), and hence \(z \le y\), \(z \not = y\), and \(z \in \mathrm{dom\,}f\) by \(f(z)={\tilde{f}}(z) < +\,\infty \). In the following we assume \(A^{+} \cup A^{-} \cup A^{\circ } \not = \emptyset \), which implies \(z \not \in {{\mathbb {Z}}}^{n}\).
The integral neighborhood N(z) of z consists of all points \(y'\) that can be represented as
for a subset D of \(A^{+} \cup A^{-} \cup A^{\circ }\). Since \({\tilde{f}}(z) < +\,\infty \) and \(z \not \in {{\mathbb {Z}}}^{n}\), we must have \(|N(z) \cap \mathrm{dom\,}f| \ge 2\), which implies that there exists a nonempty D for which \(y' \in \mathrm{dom\,}f\). Take any \(y' \in N(z) \cap \mathrm{dom\,}f\) with \(D \not = \emptyset \). Then \(y' \le y\) and \(y' \ne y\), since \(A^{-} \not = A\) and
We also have \(y' \in (\alpha \varvec{1}_{A} + C_{A})\) by an alternative expression of \(y'\):
Hence \(y' \in (\alpha \varvec{1}_{A} + C_{A}) \cap (\mathrm{dom\,}f)\), a contradiction to the minimality of y. \(\square \)
We are now in a position to prove Proposition 5.4. To prove the contrapositive of the claim, suppose that \(y \in \alpha \varvec{1}_{A} + C_{A}\) for some A. Then y can be expressed as
for some \(\mu _{i}^{+},\mu _{i}^{-}, \mu _{i}^{\circ }, \lambda \in {{\mathbb {Z}}}_{+}\). Equivalently,
which corresponds to the right-hand side of (5.11) with \(x = (\alpha -1) \varvec{1}_{A}\). By Lemma 5.8, we have \(\alpha \varvec{1}_{A} \in \mathrm{dom\,}f\). Since \(x = (\alpha -1) \varvec{1}_{A} \preceq _{f} \alpha \varvec{1}_{A} = x + \varvec{1}_{A}\) by Lemma 5.5, Lemma 5.7 shows that y is not f-minimal. This completes the proof of Proposition 5.4.
5.3 Proof of Proposition 5.2 for \(n=2\)
In this section we prove Proposition 5.2 for \(n=2\) as an illustration of the proof method using the tools introduced in Sect. 5.2. This also gives an alternative proof of Theorem 4.1.
Recall that \(\mathrm{dom\,}f\) is assumed to be a bounded subset of \({{\mathbb {Z}}}_{+}^{2}\), which implies, in particular, that \(\arg \min f \not = \emptyset \). Take \(x^{*}=(x_{1}^{*},x_{2}^{*}) \in \arg \min f\) that is f-minimal. We may assume \(x_{1}^{*} \ge x_{2}^{*}\) by Proposition 2.3 (2). Since \(x^{*}\) is f-minimal, Proposition 5.4 shows that \(x^{*}\) belongs to
where \(A = \{ 1 \}\) and \(N = \{ 1, 2 \}\). On noting
we see that \(X^{*}\) consists of all integer points contained in the parallelogram with vertices (0, 0), \((\alpha -1, 0)\), \((2\alpha -2, \alpha -1)\), \((\alpha -1, \alpha -1)\). Therefore, \( \Vert x^{*}\Vert _{\infty } \le 2(\alpha -1)\). Thus Proposition 5.2 for \(n=2\) is proved.
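The parallelogram just described can be enumerated directly; a small check (the choice \(\alpha = 4\) is arbitrary):

```python
alpha = 4
# integer points s*(1,0) + t*(1,1) with 0 <= s, t <= alpha - 1:
# the parallelogram with vertices (0,0), (alpha-1,0),
# (2*alpha-2, alpha-1), (alpha-1, alpha-1)
pts = {(s + t, t) for s in range(alpha) for t in range(alpha)}
assert (0, 0) in pts and (2 * alpha - 2, alpha - 1) in pts
assert max(max(abs(x1), abs(x2)) for (x1, x2) in pts) == 2 * (alpha - 1)
```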
5.4 Proof of Proposition 5.2 for \(n \ge 3\)
In this section we prove Proposition 5.2 for \(n \ge 3\) by induction on n. Accordingly we assume that Proposition 5.2 is true for every integrally convex function in \(n-1\) variables.
Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function such that \(\mathrm{dom\,}f\) is a bounded subset of \({{\mathbb {Z}}}_{+}^{n}\) containing the origin \(\varvec{0}\). Note that \(\arg \min f \not = \emptyset \) and take \(x^{*}=(x_{1}^{*},x_{2}^{*},\ldots ,x_{n}^{*}) \in \arg \min f\) that is f-minimal. Then
We may assume
by Proposition 2.3 (2).
The following lemma reveals a significant property of integrally convex functions that will be used here for induction on n. Note that, by (5.15), \(x^{*}\) satisfies the condition imposed on \(x^{\bullet }\).
Lemma 5.9
Let \(x^{\bullet } \in \mathrm{dom\,}f\) be an f-minimal point. Then for any \(i \in N\) with \(x^{\bullet }_i \ge 1\) there exists an f-minimal point \(x^{\circ } \in \mathrm{dom\,}f\) such that
Proof
Let \(x^{\circ }\) be a minimizer of f(x) among those x which belong to \(X = \{ x \in {{\mathbb {Z}}}^{n} \mid \mathbf{0}\le x \le x^{\bullet }, \ \Vert x-x^{\bullet } \Vert _\infty = 1, \ x_{i} = x^{\bullet }_{i} -1 \}\); in case of multiple minimizers, we choose a minimal minimizer with respect to the vector ordering (componentwise ordering). To prove f-minimality of \(x^{\circ }\), suppose, to the contrary, that there exists \(z \in [\mathbf{0},x^{\circ }]_{{{\mathbb {Z}}}} {\setminus } \{x^{\circ }\}\) with \(f(z) \le f(x^{\circ })\). We have \(\ell = \Vert z-x^{\bullet } \Vert _\infty \ge 2\), since otherwise \(z \in X\) and this contradicts the minimality of \(x^{\circ }\).
Consider \(y \in {{\mathbb {R}}}_{+}^{n}\) defined by
The value of the local convex extension \({\tilde{f}}\) of f at y can be represented as
with some set \(Y \subseteq N(y) \cap \mathrm{dom\,}f\) and positive coefficients \(\lambda _{j}\) such that
Since \(\Vert y-x^{\bullet } \Vert _\infty = 1\) and \(y \le x^{\bullet }\) by (5.17), either \(y^{j}_{i} = x^{\bullet }_{i}-1\) or \(y^{j}_{i} = x^{\bullet }_{i} \) holds for each \(y^{j} \in Y\). Define
Then we see
from
On the other hand, we have
by integral convexity of f. We divide into cases to derive a contradiction to this inequality.
Case 1 (\(x^{\bullet }_{i}-z_{i} = \ell \)): We have \(Y = Y^{<}\) by (5.18) and (5.19) and then \(f(x^{\circ }) \le f(y^{j})\) for all \(y^{j} \in Y\) by the definition of \(x^{\circ }\). Hence
For the right-hand side of (5.20), note first that the f-minimality of \(x^{\bullet }\) and \(x^{\circ } \in [\mathbf{0},x^{\bullet }]_{{{\mathbb {Z}}}} {\setminus } \{x^{\bullet }\}\) imply \(f(x^{\bullet }) < f(x^{\circ })\). Then it follows from \(f(x^{\bullet }) < f(x^{\circ })\) and \(f(z) \le f(x^{\circ })\) that
But (5.21) and (5.22) together contradict (5.20).
Case 2 (\(x^{\bullet }_{i}-z_{i} < \ell \)): In this case \(Y^{=}\) is nonempty. Since \(x^{\bullet } \not \in Y\) by \(\Vert x^{\bullet } - y \Vert _\infty = 1\), every \(y^{j} \in Y\) is distinct from \(x^{\bullet }\), whereas \(y^{j} \in [\mathbf{0},x^{\bullet }]_{{{\mathbb {Z}}}}\). Then the assumed f-minimality of \(x^{\bullet }\) implies
We also have
which is obvious from the definitions of \(x^{\circ }\) and z. Then we have
This is a contradiction to (5.20). \(\square \)
Lemma 5.9 can be applied repeatedly, since the resulting point \(x^{\circ }\) satisfies the condition imposed on the initial point \(x^{\bullet }\). Starting with \(x^{\bullet } = x^{*}\) we apply Lemma 5.9 repeatedly with \(i = n\). After \(x_{n}^{*}\) applications, we arrive at a point \({\hat{x}} = ({\hat{x}}_{1},{\hat{x}}_{2},\ldots ,{\hat{x}}_{n-1},0)\). This point \({\hat{x}}\) is f-minimal and
We now consider a function \({\hat{f}}: {{\mathbb {Z}}}^{n-1} \rightarrow {{\mathbb {R}}}\cup \{+\,\infty \}\) defined by
This function \({\hat{f}}\) is an integrally convex function in \(n-1\) variables, and the origin \(\mathbf{0}\) is \(\alpha \)-local minimal for \({\hat{f}}\). By the induction hypothesis, we can apply Proposition 5.2 to \({\hat{f}}\) to obtain
Note that \({\hat{x}}\) is the unique minimizer of \({\hat{f}}\).
Combining (5.25) and (5.26) we obtain
We also have
as a consequence of f-minimality of \(x^{*}\); see Lemma 5.10 below. It follows from (5.27) and (5.28) that
This implies
where the recurrence relation \(\beta _{n} = \frac{n+1}{2}\, \beta _{n-1} + 1\) in (5.3) is used.
It remains to derive inequality (5.28) from f-minimality of \(x^{*}\).
Lemma 5.10
The following inequalities hold for \(x^{*}\) and \(\alpha \).
(1)
$$\begin{aligned} \sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) \ge x_{n}^{*} - \alpha + 1. \end{aligned}$$(5.29)
(2)
$$\begin{aligned} \sum _{i=2}^{n}\left( x_{1}^{*} - x_{i}^{*}\right) \ge x_{1}^{*} - \alpha + 1. \end{aligned}$$(5.30)
(3)
$$\begin{aligned} x_{n}^{*} \le \frac{n-1}{n+1} x_{1}^{*} + \frac{2(\alpha - 1)}{n+1}. \end{aligned}$$(5.31)
Proof
(1) To prove by contradiction, suppose that \(\sum _{i=1}^{n-1}(x_{i}^{*} - x_{n}^{*}) \le x_{n}^{*} - \alpha \). Then the expression
shows \(x^{*} \in \alpha \varvec{1}_{N} + C_{N}\). By Proposition 5.4, this contradicts the fact that \(x^{*}\) is f-minimal.
(2) To prove by contradiction, suppose that \(\displaystyle \sum _{i=2}^{n}(x_{1}^{*} - x_{i}^{*}) \le x_{1}^{*} - \alpha \). Then the expression
shows \(x^{*} \in \alpha \varvec{1}_{N} + C_{N}\). By Proposition 5.4, this contradicts the fact that \(x^{*}\) is f-minimal.
(3) Since
$$\begin{aligned} \sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) + \sum _{i=2}^{n}\left( x_{1}^{*} - x_{i}^{*}\right) = n \left( x_{1}^{*} - x_{n}^{*}\right) , \end{aligned}$$
the addition of (5.29) and (5.30) yields
$$\begin{aligned} n \left( x_{1}^{*} - x_{n}^{*}\right) \ge x_{1}^{*} + x_{n}^{*} - 2(\alpha - 1), \end{aligned}$$
which is equivalent to (5.31). \(\square \)
This completes the proof of Proposition 5.2, and hence that of Theorem 5.1 (1).
5.5 Estimation of \(\beta _{n}\)
The estimate of \(\beta _{n}\) given in Theorem 5.1 (2) is derived in this section.
The recurrence relation (5.3) can be rewritten as
from which follows
For the last term we have
since
Substitution of (5.33) into (5.32) yields
Thus the upper bound (5.4) is proved.
6 Optimization of integrally convex functions
Despite the fact that the factor \(\beta _{n}\) of the proximity bound is superexponential in n and that integral convexity is not stable under scaling, we can design a proximity-scaling type algorithm for minimizing integrally convex functions with bounded effective domains. The algorithm runs in \(C(n) \log _{2} K_{\infty }\) time for some constant C(n) depending only on n, where \(K_{\infty }\) \((>0)\) denotes the \(\ell _{\infty }\)-size of the effective domain. This means that, if the dimension n is fixed and treated as a constant, the algorithm is polynomial in the problem size. Note that no algorithm for integrally convex function minimization can be polynomial in n, since any function on the unit cube \(\{ 0,1 \}^{n}\) is integrally convex.
The proposed algorithm is a modification of the generic proximity-scaling algorithm given in the Introduction. In Step S1, we replace the function \({{\tilde{f}}}(y) = f(x + \alpha y)\) with its restriction to the discrete rectangle \(\{ y \in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} (2\alpha - 1) \}\), which is denoted by \({{\hat{f}}}(y)\). Then a local minimizer of \({{\hat{f}}}(y)\) is found to update x to \(x+ \alpha y\). Note that a local minimizer of \({{\hat{f}}}(y)\) can be found, e.g., by any descent method (the steepest descent method, in particular).
The correctness of the algorithm can be shown as follows. We first assume that f has a unique (global) minimizer \(x^{*}\). Let \(x^{2\alpha }\) denote the vector x at the beginning of Step S1, and define
Note that \(f^{(\alpha )}\) is integrally convex, whereas \({{\hat{f}}}^{(\alpha )}\) is not necessarily so. Let \({{\hat{y}}}^{\alpha }\) be the output of Step S1 and \(x^{\alpha } = x^{2\alpha } + \alpha {{\hat{y}}}^{\alpha }\). Then \({{\hat{y}}}^{\alpha }\) is a local minimizer of \({{\hat{f}}}^{(\alpha )}\) and \(x^{\alpha } - x^{2\alpha } = \alpha {{\hat{y}}}^{\alpha } \in (\alpha {\mathbb {Z}})^{n}\).
Lemma 6.1
\(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) for all \(\alpha \).
Proof
This is obviously true in the initial phase with \(\alpha = 2^{\lceil \log _{2} K_{\infty } \rceil }\). To prove \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) by induction on descending \(\alpha \), we show that \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) implies \(x^{*} \in \mathrm{dom\,}f^{(\alpha /2)}\). Since \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) and \(x^{*} \in \arg \min f\), we have \(x^{*} \in \arg \min f^{(\alpha )}\). On the other hand, \(x^{\alpha }\) is an \(\alpha \)-local minimizer of \(f^{(\alpha )}\), since \({{\hat{y}}}^{\alpha }\) is a local minimizer of \({{\hat{f}}}^{(\alpha )}\). Then, by the proximity theorem (Theorem 5.1) for \(f^{(\alpha )}\), we obtain \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\), which shows \(x^{*} \in \mathrm{dom\,}f^{(\alpha /2)}\). \(\square \)
In the final phase with \(\alpha = 1\), \(f^{(\alpha )}\) is an integrally convex function, and hence, by Theorem 2.5, an \(\alpha \)-local minimizer of \(f^{(\alpha )}\) is a global minimizer of \(f^{(\alpha )}\). This observation, with Lemma 6.1 above, shows that the output of the algorithm is a global minimizer of f.
The complexity of the algorithm can be analyzed as follows. The number of iterations in the descent method is bounded by the total number of points in \(Y = \{ y\in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} (2\alpha -1) \}\), which is bounded by \((4\beta _{n})^{n}\). For each y we examine all of its \(3^{n}\) neighboring points to find a descent direction or verify its local minimality. Thus Step S1, which updates \(x^{2\alpha }\) to \(x^{\alpha }\), can be done with at most \((12\beta _{n})^{n}\) function evaluations. The number of scaling phases is \(\log _{2} K_{\infty }\). Therefore, the time complexity (or the number of function evaluations) is bounded by \((12\beta _{n})^{n}\log _{2} K_{\infty }\). For a fixed n, this gives a polynomial bound \(O(\log _{2} K_{\infty })\) in the problem size.
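The whole procedure can be sketched in Python as follows; this is an illustrative reading of the algorithm, not the authors' pseudocode, and it assumes f is supplied as a function returning `math.inf` outside its effective domain, with the local search in Step S1 done by steepest descent over the \(3^{n}-1\) directions:

```python
import math
from fractions import Fraction
from itertools import product

def beta(n):
    # recurrence (5.3)
    b = Fraction(1) if n == 1 else Fraction(2)
    for k in range(3, n + 1):
        b = Fraction(k + 1, 2) * b + 1
    return b

def minimize_integrally_convex(f, n, K):
    # dom f is assumed to lie in [-K, K]^n and to contain the origin
    bn = beta(n)
    dirs = [d for d in product((-1, 0, 1), repeat=n) if any(d)]
    x = (0,) * n
    alpha = 1 << max(0, math.ceil(math.log2(K)))
    while alpha >= 1:
        # Step S1: steepest descent for f(x + alpha*y) on the rectangle
        # || alpha*y ||_inf <= beta_n * (2*alpha - 1)
        R = math.floor(bn * (2 * alpha - 1) / alpha)
        y, fy = (0,) * n, f(x)
        while True:
            best, fbest = None, fy
            for d in dirs:
                z = tuple(yi + di for yi, di in zip(y, d))
                if max(abs(c) for c in z) <= R:
                    fz = f(tuple(xi + alpha * zi for xi, zi in zip(x, z)))
                    if fz < fbest:
                        best, fbest = z, fz
            if best is None:
                break
            y, fy = best, fbest
        x = tuple(xi + alpha * yi for xi, yi in zip(x, y))
        alpha //= 2
    return x

def f(x):  # an illustrative integrally convex (separable convex) test function
    if max(abs(c) for c in x) > 10:
        return math.inf
    return (x[0] - 3) ** 2 + (x[1] + 5) ** 2

assert minimize_integrally_convex(f, 2, 10) == (3, -5)
```

The test function is separable convex, hence integrally convex, so the final phase with \(\alpha = 1\) returns a global minimizer.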
Finally, we describe how to dispense with the uniqueness assumption on the minimizer. Consider a perturbed function \(f_{\varepsilon }(x) = f(x) + \sum _{i=1}^{n} \varepsilon ^{i} x_{i}\) with a sufficiently small \(\varepsilon >0\). By the assumed boundedness of the effective domain of f, the perturbed function has a minimizer, which is unique as a result of the perturbation. To find the minimizer of \(f_{\varepsilon }\) it is not necessary to introduce the parameter \(\varepsilon \) explicitly into the algorithm; instead, it suffices to find the lexicographically smallest local minimizer y of \({{\hat{f}}}(y)\) in Step S1.
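The tie-breaking effect of the perturbation can be seen on a toy function with a segment of minimizers (the quadratic below is only illustrative):

```python
from itertools import product

eps = 1e-3
box = list(product(range(5), repeat=2))
f = lambda x: (x[0] + x[1] - 2) ** 2          # minimizers: the segment x1 + x2 = 2
f_eps = lambda x: f(x) + eps * x[0] + eps ** 2 * x[1]

fmin = min(map(f, box))
minimizers = [x for x in box if f(x) == fmin]
# the perturbed function picks out the lexicographically smallest minimizer
assert min(box, key=f_eps) == min(minimizers) == (0, 2)
```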
Remark 6.1
Some technical points are explained here. By working with \(f^{(\alpha )}\), we can bound the number of iterations for finding an \(\alpha \)-local minimizer in terms of the number of integer vectors contained in \(\mathrm{dom\,}{{\hat{f}}}^{(\alpha )}\). The vector \(x^{\alpha }\) is an \(\alpha \)-local minimizer for \(f^{(\alpha )}\), but not necessarily for the original function f. This is why we apply the proximity theorem to \(f^{(\alpha )}\) in the proof of Lemma 6.1.
Remark 6.2
The proximity bound \(\beta _{n} (\alpha - 1)\) in Theorem 5.1 is linear in \(\alpha \). This linear dependence on \(\alpha \) is critical for the complexity \(O(\log _{2} K_{\infty })\) of the algorithm when n is fixed. Suppose, for example, that the proximity bound is \(\beta _{n} (\alpha ^{m} - 1)\) for some \(m >1\). Then in the above analysis, \((2\alpha -1)\) should be replaced by \(((2\alpha )^{m} - 1)\), and the total number of points in \(Y = \{ y\in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} ((2\alpha )^{m} -1) \}\) is bounded by \((2^{m+1}\beta _{n})^{n} \alpha ^{(m-1)n}\). The sum of \(\alpha ^{(m-1)n}\) over \(\alpha = 1, 2, 2^{2},\ldots , 2^{\lceil \log _{2} K_{\infty } \rceil }\) is of the order of \(K_{\infty }^{(m-1)n}\). Then the proposed algorithm will not be polynomial in \(\log _{2} K_{\infty }\). Thus the particular form \(\beta _{n} (\alpha - 1)\) of our proximity bound is important for our algorithm.
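The order-of-magnitude claim for the sum of \(\alpha ^{(m-1)n}\) over the halving sequence is easy to check numerically; the following sketch (function name and setup are ours) sums the per-phase factor:

```python
import math

def phase_work_sum(K, m, n):
    """Sum of alpha**((m-1)*n) over the halving sequence
    alpha = 1, 2, 4, ..., 2**ceil(log2 K).  For m > 1 this geometric
    series is dominated by its last term and grows like K**((m-1)*n);
    for m = 1 it simply counts the number of scaling phases."""
    p = (m - 1) * n
    return sum((2 ** k) ** p for k in range(math.ceil(math.log2(K)) + 1))
```

For instance, with \(m = 2\), \(n = 2\) and \(K = 1024\) the sum is already of the order of \(K^{2}\), confirming that any superlinear dependence on \(\alpha \) in the proximity bound destroys polynomiality in \(\log _{2} K_{\infty }\).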
7 Concluding remarks
As shown in this paper, the nice properties of \(\mathrm{L}^{\natural }\)-convex functions, such as stability under scaling and the proximity bound \(n(\alpha -1)\), are not shared by integrally convex functions in general. Two subclasses of integrally convex functions that still enjoy these properties have been introduced in [19], based on discrete midpoint convexity (1.1) required for every pair \((x, y) \in {{\mathbb {Z}}}^{n} \times {{\mathbb {Z}}}^{n}\) with \(\Vert x - y \Vert _{\infty } \ge 2\) (resp., with \(\Vert x - y \Vert _{\infty } = 2\)). Both classes are superclasses of \(\mathrm{L}^{\natural }\)-convex functions and subclasses of integrally convex functions; they are closed under scaling and admit a proximity theorem with the bound \(n (\alpha -1)\) for all n. See [19] for details.
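Discrete midpoint convexity (1.1) is straightforward to test by brute force on a finite box. A Python sketch (the checker and the sample functions are ours; the threshold \(\Vert x - y \Vert _{\infty } \ge 2\) corresponds to the first of the two classes):

```python
import math
from itertools import product

def midpoint_convex(f, box):
    """Brute-force check of discrete midpoint convexity (1.1):
        f(ceil((x+y)/2)) + f(floor((x+y)/2)) <= f(x) + f(y)
    for every pair x, y in `box` with ||x - y||_inf >= 2
    (replace >= by == to test the weaker, second condition)."""
    for x, y in product(box, repeat=2):
        if max(abs(a - b) for a, b in zip(x, y)) >= 2:
            up = tuple(math.ceil((a + b) / 2) for a, b in zip(x, y))
            lo = tuple(math.floor((a + b) / 2) for a, b in zip(x, y))
            if f(up) + f(lo) > f(x) + f(y):
                return False
    return True
```

A separable convex function such as \(x_{1}^{2} + x_{2}^{2}\) (which is \(\mathrm{L}^{\natural }\)-convex) passes this check, while \(x_{1} x_{2}\) fails it, e.g., for \(x = (1,-1)\), \(y = (-1,1)\).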
Notes
\({{\mathbb {Z}}}\)-valued functions are treated in [4, Theorem 3], but the proof is valid for \({{\mathbb {R}}}\)-valued functions.
Recall that \(n=3\) in Example 4.4.
x is f-minimal if and only if \(\arg \min f_{[ \mathbf{0},x ]} = \{ x \}\) for the function \( f_{[\mathbf{0},x]}(y) = \begin{cases} f(y) & (y \in [\mathbf{0},x]_{{{\mathbb {Z}}}}), \\ +\infty & (y \in {{\mathbb {Z}}}^{n} {\setminus } [\mathbf{0},x]_{{{\mathbb {Z}}}}). \end{cases} \)
See H. Tuy: D.C. optimization: Theory, methods and algorithms, in: R. Horst and P. M. Pardalos, eds., Handbook of Global Optimization, Kluwer Academic Publishers, Dordrecht, 1995, 149–216; Lemma 2 to be specific.
References
Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows—Theory, Algorithms and Applications. Prentice-Hall, Englewood Cliffs (1993)
Favati, P., Tardella, F.: Convexity in nonlinear integer programming. Ric. Oper. 53, 3–44 (1990)
Fujishige, S.: Bisubmodular polyhedra, simplicial divisions, and discrete convexity. Discrete Optim. 12, 115–120 (2014)
Fujishige, S., Murota, K.: Notes on L-/M-convex functions and the separation theorems. Math. Program. 88, 129–146 (2000)
Hemmecke, R., Köppe, M., Lee, J., Weismantel, R.: Chapter 15 Nonlinear integer programming. In: Jünger, M., et al. (eds.) 50 Years of Integer Programming 1958–2008, pp. 561–618. Springer, Berlin (2010)
Hirai, H.: L-extendable functions and a proximity scaling algorithm for minimum cost multiflow problem. Discrete Optim. 18, 1–37 (2015)
Hirai, H.: L-convexity on graph structures. J. Oper. Res. Soc. Japan (to appear) (2018)
Hochbaum, D.S.: Complexity and algorithms for nonlinear optimization problems. Ann. Oper. Res. 153, 257–296 (2007)
Hochbaum, D.S., Shanthikumar, J.G.: Convex separable optimization is not much harder than linear optimization. J. Assoc. Comput. Mach. 37, 843–862 (1990)
Ibaraki, T., Katoh, N.: Resource Allocation Problems: Algorithmic Approaches. MIT Press, Boston (1988)
Iimura, T.: Discrete modeling of economic equilibrium problems. Pac. J. Optim. 6, 57–64 (2010)
Iimura, T., Murota, K., Tamura, A.: Discrete fixed point theorem reconsidered. J. Math. Econ. 41, 1030–1036 (2005)
Iimura, T., Watanabe, T.: Existence of a pure strategy equilibrium in finite symmetric games where payoff functions are integrally concave. Discrete Appl. Math. 166, 26–33 (2014)
Iwata, S., Moriguchi, S., Murota, K.: A capacity scaling algorithm for M-convex submodular flow. Math. Program. 103, 181–202 (2005)
Iwata, S., Shigeno, M.: Conjugate scaling algorithm for Fenchel-type duality in discrete convex optimization. SIAM J. Optim. 13, 204–211 (2002)
Katoh, N., Shioura, A., Ibaraki, T.: Resource allocation problems. In: Pardalos, P.M., Du, D.-Z., Graham, R.L. (eds.) Handbook of Combinatorial Optimization, vol. 5, 2nd edn, pp. 2897–2988. Springer, Berlin (2013)
van der Laan, G., Talman, D., Yang, Z.: Solving discrete systems of nonlinear equations. Eur. J. Oper. Res. 214, 493–500 (2011)
Moriguchi, S., Murota, K., Shioura, A.: Scaling algorithms for M-convex function minimization. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E85-A, 922–929 (2002)
Moriguchi, S., Murota, K., Tamura, A., Tardella, F.: Discrete midpoint convexity. arXiv:1708.04579 (2017)
Moriguchi, S., Shioura, A., Tsuchimura, N.: M-convex function minimization by continuous relaxation approach—proximity theorem and algorithm. SIAM J. Optim. 21, 633–668 (2011)
Moriguchi, S., Tsuchimura, N.: Discrete L-convex function minimization based on continuous relaxation. Pac. J. Optim. 5, 227–236 (2009)
Murota, K.: Discrete convex analysis. Math. Program. 83, 313–371 (1998)
Murota, K.: Discrete Convex Analysis. SIAM, Philadelphia (2003)
Murota, K.: Recent developments in discrete convex analysis. In: Cook, W., Lovász, L., Vygen, J. (eds.) Research Trends in Combinatorial Optimization, pp. 219–260. Springer, Berlin (2009)
Murota, K.: Discrete convex analysis: a tool for economics and game theory. J. Mech. Inst. Des. 1, 151–273 (2016)
Murota, K., Tamura, A.: Proximity Theorems of Discrete Convex Functions. RIMS Preprint 1358, Kyoto University (2002)
Murota, K., Tamura, A.: Proximity theorems of discrete convex functions. Math. Program. 99, 539–562 (2004)
Queyranne, M., Tardella, F.: Bimonotone linear inequalities and sublattices of \({\mathbb{R}}^{n}\). Linear Algebra Appl. 413, 100–120 (2006)
Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York (1986)
Shioura, A.: Fast scaling algorithms for M-convex function minimization with application to the resource allocation problem. Discrete Appl. Math. 134, 303–316 (2004)
Tamir, A.: A strongly polynomial algorithm for minimum convex separable quadratic cost flow problems on series–parallel networks. Math. Program. 59, 117–132 (1993)
Tamir, A.: New pseudopolynomial complexity bounds for the bounded and other integer Knapsack related problems. Oper. Res. Lett. 37, 303–306 (2009)
Tamura, A.: Coordinatewise domain scaling algorithm for M-convex function minimization. Math. Program. 102, 339–354 (2005)
Yang, Z.: On the solutions of discrete nonlinear complementarity and related problems. Math. Oper. Res. 33, 976–990 (2008)
Yang, Z.: Discrete fixed point analysis and its applications. J. Fixed Point Theory Appl. 6, 351–371 (2009)
Acknowledgements
The authors thank Yoshio Okamoto for communicating a relevant reference and two anonymous referees for detailed comments. This research was initiated at the Trimester Program “Combinatorial Optimization” at Hausdorff Institute of Mathematics, 2015. This work was supported by The Mitsubishi Foundation, CREST, JST, Grant Number JPMJCR14D2, Japan, and JSPS KAKENHI Grant Numbers JP26350430, JP26280004, JP16K00023, JP17K00037.
Additional information
The extended abstract of this paper is included in the Proceedings of the 27th International Symposium on Algorithms and Computation (ISAAC), Sydney, December 12–14, 2016. Leibniz International Proceedings in Informatics (LIPIcs), 64 (2016), 57:1–57:12, Dagstuhl Publishing.
Appendices
Appendix A: An alternative Proof of Theorem 2.4
Here is a proof of Theorem 2.4 (local characterization of integral convexity) that is shorter than the original proof in [2] and valid for functions defined on general integrally convex sets rather than discrete rectangles.
Obviously, (a) implies (b). The proof for the converse, (b) \(\Rightarrow \) (a), is given by the following two lemmas, where integral convexity of \(\mathrm{dom\,}f\) and condition (b) are assumed.
Lemma A.1
Let \(B \subseteq {{\mathbb {R}}}^{n}\) be a box of size two with integer vertices, i.e., \(B = [ {a}, {a} + 2 \varvec{1} ]_{{{\mathbb {R}}}}\) for some \({a} \in {{\mathbb {Z}}}^{n}\). Then \({\tilde{f}}\) is convex on \(B \cap \overline{\mathrm{dom\,}f}\).
Proof
First, the assumed integral convexity of \(\mathrm{dom\,}f\) implies that \(B \cap \overline{\mathrm{dom\,}f} = \overline{B \cap \mathrm{dom\,}f}\) and that every point in \(B \cap \overline{\mathrm{dom\,}f}\) can be represented as a convex combination of points in \(B \cap \mathrm{dom\,}f\). We may assume \(B = [ \varvec{0}, 2 \varvec{1} ]_{{{\mathbb {R}}}}\). To prove by contradiction, assume that there exist \(x \in B \cap \overline{\mathrm{dom\,}f}\) and \(y^{1},\ldots , y^{m} \in B \cap \mathrm{dom\,}f\) such that
where \(\sum _{i=1}^{m} \lambda _{i} = 1\) and \(\lambda _{i} > 0 \ (i=1,\ldots , m)\). We may also assume \(x \in [ \varvec{0}, \varvec{1} ]_{{{\mathbb {R}}}}\) without loss of generality. For each \(j=1,\ldots , n\), we look at the j-th component of the generating points \(y^{i}\) to define
Since \( x_{j} = \sum _{i=1}^{m} \lambda _{i} y^{i}_{j} \le 1\), if \(I_{j}^{2} \not = \emptyset \), then \(I_{j}^{0} \not = \emptyset \).
Let \(j=n\) and suppose that \(I_{n}^{2} \not = \emptyset \). Then \(I_{n}^{0} \not = \emptyset \). We may assume \(y^{1}_{n} = 0\), \(y^{2}_{n} = 2\); \(\lambda _{1} > 0\), \(\lambda _{2} > 0\). By (2.2) for \((y^{1}, y^{2})\) and the definition of \({\tilde{f}}\) we have
where
with \(\mu _{k} > 0\) \((k=1,\ldots , l)\) and \(\sum _{k=1}^{l} \mu _{k} = 1\). This implies, with the notation \(\lambda = \min (\lambda _{1}, \lambda _{2})\), that
Hence
Since
we have obtained another representation of the form (A.1). With reference to this new representation define \({\hat{I}}_{n}^{0}\) (resp., \({\hat{I}}_{n}^{2}\)) to be the set of indices of the generators whose n-th component is equal to 0 (resp., 2). Since \(z^{k}_{n} = 1\) for all k as a consequence of (A.2) with \((y^{1}_{n}+ y^{2}_{n})/2 = (0 + 2)/2 = 1\), we have \({\hat{I}}_{n}^{0} \subseteq I_{n}^{0}\), \({\hat{I}}_{n}^{2} \subseteq I_{n}^{2}\) and \(|{\hat{I}}_{n}^{0}| + |{\hat{I}}_{n}^{2}| \le |I_{n}^{0}| + |I_{n}^{2}| - 1\).
By repeating the above process with \(j=n\), we eventually arrive at a representation of the form of (A.1) with \(I_{n}^{2} = \emptyset \), which means that \(y^{i}_{n} \in \{ 0,1 \}\) for all generators \(y^{i}\).
Then we repeat the above process for \(j=n-1,n-2, \ldots ,1\), to obtain a representation of the form of (A.1) with \(y^{i} \in [ \varvec{0}, \varvec{1} ]_{{{\mathbb {Z}}}}\) for all generators \(y^{i}\). This contradicts the definition of \({\tilde{f}}\). \(\square \)
Lemma A.2
For any \(x, y \in \overline{\mathrm{dom\,}f}\), \({\tilde{f}}\) is convex on the line segment connecting x and y.
Proof
Let L denote the (closed) line segment connecting x and y, and consider the boxes B, as in Lemma A.1, that intersect L. There are finitely many such boxes, say \(B_{1}, \ldots , B_{m}\), and L is covered by the line segments \(L_{j} = L \cap B_{j}\) \((j=1,\ldots , m)\); that is, \( L = \bigcup _{j=1}^{m} L_{j}\). Each point \(z \in L {\setminus } \{ x, y \}\) lies in the interior of some \(L_{j}\). Since \(L_{j} \subseteq L \subseteq \overline{\mathrm{dom\,}f}\), \({\tilde{f}}\) is convex on \(L_{j}\) by Lemma A.1. Hence \({\tilde{f}}\) is convex on L (see footnote 4). \(\square \)
Appendix B: Proof of Proposition 5.3
It is known (cf. [29, proof of Theorem 16.4]) that the set of integer vectors contained in
forms a Hilbert basis of \({\tilde{C}}_{A}\). Let z be an integer vector in \(F_{A}\). That is, \(z \in {{\mathbb {Z}}}^{n}\) and
for some \(\mu _{i}^{+}, \mu _{i}^{-} \in [0,1]_{{{\mathbb {R}}}}\;(i \in A)\); \(\mu _{i}^{\circ } \in [0,1]_{{{\mathbb {R}}}}\; (i \in N {\setminus } A)\); \(\lambda \in [0,1]_{{{\mathbb {R}}}}\). Our goal is to show that z can be represented as a nonnegative integer combination of vectors in \(B_{A}\).
First note that \(\mu _{i}^{\circ } \in \{ 0, 1 \}\) for each \(i \in N {\setminus } A\); define \(A^{\circ } = \{ i \in N {\setminus } A \mid \mu _{i}^{\circ }=1 \}\). We denote the coefficient of \(\varvec{1}_{A}\) in (B.2) as
and divide into cases according to whether \(\xi \) is an integer or not.
Case 1 (\(\xi \in {{\mathbb {Z}}}\)): Using \(\xi \) we rewrite (B.2) as
in which \(\xi \) is an integer. For each \(i \in A\), \(\mu _{i}^{+}-\mu _{i}^{-}\) must be an integer, which is equal to 0, 1 or \(-1\). Accordingly we define
to rewrite (B.1) as
Here the coefficient of \(\varvec{1}_{A}\) is integral, since
Hence (B.3) gives a representation of z as a nonnegative integer combination of vectors in \(B_{A}\).
Case 2 (\(\xi \not \in {{\mathbb {Z}}}\)): Let \(\eta \) denote the fractional part of \(\xi \), i.e., \(\eta = \xi - \lfloor \xi \rfloor \) with \(0< \eta < 1\). We rewrite (B.2) as
For each \(i \in A\), \(\mu _{i}^{+}-\mu _{i}^{-} + \eta \) must be an integer, which is equal to 1 or 0. Accordingly we define
Then
which follows from
In the case of \(|A^{+}| \le |A^{-}|\), we see from (B.4) that
which is a nonnegative integer combination of vectors in \(B_{A}\). In the other case with \(|A^{+}| > |A^{-}|\), we have an alternative expression
which is also a nonnegative integer combination of vectors in \(B_{A}\). This completes the proof of Proposition 5.3.
Moriguchi, S., Murota, K., Tamura, A. et al. Scaling, proximity, and optimization of integrally convex functions. Math. Program. 175, 119–154 (2019). https://doi.org/10.1007/s10107-018-1234-z