1 Introduction

The proximity-scaling approach is a fundamental technique in designing efficient algorithms for discrete or combinatorial optimization. For a function \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) in integer variables and a positive integer \(\alpha \), called a scaling unit, the \(\alpha \)-scaling of f means the function \(f^{\alpha }\) defined by \(f^{\alpha }(x) = f(\alpha x) \)\((x \in {\mathbb {Z}}^{n})\). A proximity theorem is a result guaranteeing that a (local) minimum of the scaled function \(f^{\alpha }\) is close to a minimizer of the original function f. The scaled function \(f^{\alpha }\) is simpler and hence easier to minimize, whereas the quality of the obtained minimizer of \(f^{\alpha }\) as an approximation to the minimizer of f is guaranteed by a proximity theorem. The proximity-scaling approach consists in applying this idea for a decreasing sequence of \(\alpha \), often by halving the scaling unit \(\alpha \). A generic form of a proximity-scaling algorithm may be described as follows, where \(K_{\infty }\ (> 0)\) denotes the \(\ell _{\infty }\)-size of the effective domain \(\mathrm{dom\,}f = \{ x \in {{\mathbb {Z}}}^{n} \mid f(x) < +\,\infty \}\) and \(B(n,\alpha )\) denotes the proximity bound in \(\ell _{\infty }\)-distance for \(f^{\alpha }\).

figure a

The algorithm consists of \(O(\log _{2} K_{\infty })\) scaling phases. This approach has been particularly successful for resource allocation problems [8,9,10, 16] and for convex network flow problems (under the name of “capacity scaling”) [1, 14, 15]. Different types of proximity theorems have also been investigated: proximity between integral and real optimal solutions [9, 31, 32], among others. For other types of algorithms of nonlinear integer optimization, see, e.g., [5].
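In Python, the generic scheme might be sketched as follows. The function names (`alpha_local_minimize`, `proximity_scaling`) and the naive descent subroutine are illustrative assumptions, not the pseudocode of any reference; f is assumed to be a callable returning \(+\infty \) outside its effective domain.

```python
from itertools import product

def alpha_local_minimize(f, x, alpha, n):
    """Descend until no step alpha*d with d in {-1,0,+1}^n improves f."""
    improved = True
    while improved:
        improved = False
        for d in product((-1, 0, 1), repeat=n):
            y = tuple(x[i] + alpha * d[i] for i in range(n))
            if f(y) < f(x):
                x, improved = y, True
                break
    return x

def proximity_scaling(f, x, n, K_inf):
    """Generic proximity-scaling: find an alpha-local minimizer, halve
    alpha, and repeat down to alpha = 1; a proximity theorem guarantees
    that each phase starts close to a minimizer of the finer problem."""
    alpha = 1
    while 2 * alpha <= K_inf:
        alpha *= 2
    while alpha >= 1:
        x = alpha_local_minimize(f, x, alpha, n)
        alpha //= 2
    return x
```

For instance, for the separable convex function \(f(x) = (x_{1}-3)^{2} + (x_{2}-3)^{2}\) on \([0,8]^{2}\), the call `proximity_scaling(f, (0, 0), 2, 8)` returns (3, 3). An efficient implementation would additionally restrict each phase to a box of radius \(B(n,\alpha )\) around the previous point.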

In discrete convex analysis [22,23,24,25], a variety of discrete convex functions are considered. A separable convex function is a function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) that can be represented as \(f(x) = \varphi _{1}(x_{1}) + \cdots + \varphi _{n}(x_{n})\), where \(x=(x_{1}, \ldots ,x_{n})\), with univariate discrete convex functions \(\varphi _{i}: {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) satisfying \(\varphi _{i}(t-1) + \varphi _{i}(t+1) \ge 2 \varphi _{i}(t)\) for all \(t \in {{\mathbb {Z}}}\).

A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called integrally convex if its local convex extension \({\tilde{f}}: {{\mathbb {R}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is (globally) convex in the ordinary sense, where \({\tilde{f}}\) is defined as the collection of convex extensions of f in each unit hypercube \(\{ x \in {{\mathbb {R}}}^{n} \mid a_{i} \le x_{i} \le a_{i} + 1 \ (i=1,\ldots , n) \}\) with \(a \in {{\mathbb {Z}}}^{n}\); see Sect. 2 for precise statements.

A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called L\(^{\natural }\)-convex if it satisfies one of the equivalent conditions in Theorem 1.1 below. For \(x,y \in {{\mathbb {Z}}}^{n}\), \(x \vee y\) and \(x \wedge y\) denote the vectors of componentwise maximum and minimum of x and y, respectively. Discrete midpoint convexity of f for \(x,y \in {{\mathbb {Z}}}^{n}\) means

$$\begin{aligned} f(x) + f(y) \ge f \left( \left\lceil \frac{x+y}{2} \right\rceil \right) + f \left( \left\lfloor \frac{x+y}{2} \right\rfloor \right) , \end{aligned}$$
(1.1)

where \(\lceil \cdot \rceil \) and \(\lfloor \cdot \rfloor \) denote the integer vectors obtained by componentwise rounding-up and rounding-down to the nearest integers, respectively. We use the notation \(\varvec{1}=(1,1,\ldots , 1)\) and \(\varvec{1}_{i}\) for the i-th unit vector \((0,\ldots ,0, \overset{\overset{i}{\vee }}{1},0,\ldots ,0)\), with the convention \(\varvec{1}_{0}=\varvec{0}\).
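The rounded midpoints and inequality (1.1) can be made concrete in code; the helper names below are illustrative, and the rounding is done with integer-exact floor division.

```python
def rounded_midpoints(x, y):
    """Componentwise ceil((x+y)/2) and floor((x+y)/2) of integer vectors,
    computed exactly with floor division."""
    up = tuple((a + b + 1) // 2 for a, b in zip(x, y))
    dn = tuple((a + b) // 2 for a, b in zip(x, y))
    return up, dn

def midpoint_convex_at(f, x, y):
    """Discrete midpoint convexity (1.1) for the single pair (x, y)."""
    up, dn = rounded_midpoints(x, y)
    return f(x) + f(y) >= f(up) + f(dn)
```

For example, with \(f(x) = \max _i x_i\) (an L\(^{\natural }\)-convex function), inequality (1.1) holds for \(x = (0,1)\), \(y = (1,0)\): the rounded midpoints are (1, 1) and (0, 0).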

Theorem 1.1

([2, 4, 23]) For \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) the following conditions, (a) to (d), are equivalent:

  (a)

    f is integrally convex and submodular:

    $$\begin{aligned} f(x) + f(y) \ge f(x \vee y) + f(x \wedge y) \qquad (x, y \in {{\mathbb {Z}}}^{n}). \end{aligned}$$
    (1.2)
  (b)

    f satisfies discrete midpoint convexity (1.1) for all \( x, y \in {{\mathbb {Z}}}^{n}\).

  (c)

    f satisfies discrete midpoint convexity (1.1) for all \( x, y \in {{\mathbb {Z}}}^{n}\) with \(\Vert x-y \Vert _{\infty } \le 2\), and the effective domain has the property: \(x, y \in \mathrm{dom\,}f \Rightarrow \left\lceil (x+y)/2 \right\rceil , \left\lfloor (x+y)/2 \right\rfloor \in \mathrm{dom\,}f \).

  (d)

    f satisfies translation-submodularity:

    $$\begin{aligned} f(x) + f(y)\ge f((x - \mu \mathbf{1}) \vee y) + f(x \wedge (y + \mu \mathbf{1}))\qquad (x, y \in {{\mathbb {Z}}}^{n}, \ 0 \le \mu \in {{\mathbb {Z}}}). \end{aligned}$$
    (1.3)

A simple example illustrating the difference between integrally convex and L\(^{\natural }\)-convex functions is provided by quadratic functions. Indeed, for an \(n \times n\) symmetric matrix Q and a vector \(p \in {{\mathbb {R}}}^n\), the function \(f(x) = x^{\top } Q x + p^{\top } x\) is integrally convex whenever Q is diagonally dominant with nonnegative diagonal elements, i.e., \(q_{ii} \ge \sum _{j \not = i} |q_{ij}|\) for \(i=1,\ldots ,n\) [2]. On the other hand, f is L\(^{\natural }\)-convex if and only if Q is diagonally dominant with nonnegative diagonal elements and \(q_{ij} \le 0\) for all \(i \ne j\) [23, Section 7.3].
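These two matrix conditions are easy to test; the following sketch (with illustrative helper names) checks diagonal dominance and the sign condition on the off-diagonal entries.

```python
def diagonally_dominant(Q):
    """q_ii >= sum_{j != i} |q_ij| for all i (this forces q_ii >= 0);
    sufficient for integral convexity of x^T Q x + p^T x."""
    n = len(Q)
    return all(Q[i][i] >= sum(abs(Q[i][j]) for j in range(n) if j != i)
               for i in range(n))

def lnatural_quadratic(Q):
    """Diagonal dominance plus q_ij <= 0 off the diagonal: the
    characterization of L-natural-convexity of the quadratic."""
    n = len(Q)
    return diagonally_dominant(Q) and all(
        Q[i][j] <= 0 for i in range(n) for j in range(n) if i != j)
```

For Q = [[2, 1], [1, 2]] the quadratic is integrally convex but not L\(^{\natural }\)-convex, since the off-diagonal entries are positive.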

A function \(f: {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) is called M\(^{\natural }\)-convex if it satisfies an exchange property: For any \(x, y \in \mathrm{dom\,}f\) and any \(i \in \mathrm{supp}^{+}(x-y)\), there exists \(j \in \mathrm{supp}^{-}(x-y) \cup \{ 0 \}\) such that

$$\begin{aligned} f(x)+f(y) \ge f(x - \varvec{1}_{i} + \varvec{1}_{j}) + f(y + \varvec{1}_{i} - \varvec{1}_{j}), \end{aligned}$$
(1.4)

where, for \(z \in {{\mathbb {Z}}}^{n}\), \(\mathrm{supp}^{+}(z) = \{ i \mid z_{i} > 0 \}\) and \(\mathrm{supp}^{-}(z) = \{ j \mid z_{j} < 0 \}\). It is known (and easy to see) that a function is separable convex if and only if it is both \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex.
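For a finite set, the exchange property (1.4) for its indicator function can be checked by brute force; the helper below is an illustrative sketch (the indicator is 0 on S and \(+\infty \) outside, so both exchanged points must stay in S).

```python
from itertools import product

def is_mnatural_set(S):
    """Brute-force check of the exchange property (1.4) for the
    indicator function of a finite set S of integer tuples."""
    for x, y in product(S, repeat=2):
        diff = [a - b for a, b in zip(x, y)]
        plus = [i for i, t in enumerate(diff) if t > 0]
        minus = [i for i, t in enumerate(diff) if t < 0]
        for i in plus:
            ok = False
            for j in minus + [None]:      # None plays the role of j = 0
                x2, y2 = list(x), list(y)
                x2[i] -= 1; y2[i] += 1
                if j is not None:
                    x2[j] += 1; y2[j] -= 1
                if tuple(x2) in S and tuple(y2) in S:
                    ok = True
                    break
            if not ok:
                return False
    return True
```

For instance, {(0,0), (1,0), (0,1)} passes, while {(0,0), (1,1)} fails: for x = (1,1), y = (0,0) no exchange is available.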

Integrally convex functions constitute a common framework for discrete convex functions, including separable convex, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions as well as \(\mathrm{L}^{\natural }_{2}\)-convex and \(\mathrm{M}^{\natural }_{2}\)-convex functions [23], and BS-convex and UJ-convex functions [3]. The concept of integral convexity is used in formulating discrete fixed point theorems [11, 12, 35], and designing solution algorithms for discrete systems of nonlinear equations [17, 34]. In game theory the integral concavity of payoff functions guarantees the existence of a pure strategy equilibrium in finite symmetric games [13].

The scaling operation preserves \(\mathrm{L}^{\natural }\)-convexity, that is, if f is \(\mathrm{L}^{\natural }\)-convex, then \(f^{\alpha }\) is \(\mathrm{L}^{\natural }\)-convex. \(\mathrm{M}^{\natural }\)-convexity is subtle in this respect: for an \(\mathrm{M}^{\natural }\)-convex function f, \(f^{\alpha }\) remains \(\mathrm{M}^{\natural }\)-convex if \(n \le 2\), while this is not always the case if \(n\ge 3\).

Example 1.1

Here is an example to show that \(\mathrm{M}^{\natural }\)-convexity is not preserved under scaling. Let f be the indicator function of the set \(S = \{ c_{1} (1,0,-1) +c_{2} (1,0,0) + c_{3} (0,1,-1) + c_{4} (0,1,0) \mid c_{i} \in \{ 0,1 \} \ (i=1,2,3,4) \} \subseteq {{\mathbb {Z}}}^{3}\). Then f is an \(\mathrm{M}^{\natural }\)-convex function, but \(f^{2}\) (=\(f^{\alpha }\) with \(\alpha =2\)), being the indicator function of \( \{ (0,0,0), (1,1,-1) \}\), is not \(\mathrm{M}^{\natural }\)-convex. This example adapts [23, Note 6.18], stated for M-convex functions, to \(\mathrm{M}^{\natural }\)-convex functions.
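The claims of Example 1.1 can be replayed numerically. The following illustrative sketch enumerates S, extracts the points with all coordinates even to obtain \(S^{2}\), and shows that the exchange property fails for \(x = (1,1,-1)\), \(y = (0,0,0)\), \(i = 1\).

```python
from itertools import product

gens = [(1, 0, -1), (1, 0, 0), (0, 1, -1), (0, 1, 0)]
S = {tuple(sum(c[k] * gens[k][i] for k in range(4)) for i in range(3))
     for c in product((0, 1), repeat=4)}

# S^2 = { x | 2x in S }: keep the points of S with all coordinates even
S2 = {tuple(t // 2 for t in x) for x in S if all(t % 2 == 0 for t in x)}

# Exchange property (1.4) for x = (1,1,-1), y = (0,0,0) and i = 1:
# supp-(x - y) = {3}, so the candidates are j = 3 and j = 0
x, y = (1, 1, -1), (0, 0, 0)
candidates = []
for j in (2, None):          # index 2 is coordinate 3; None encodes j = 0
    x2, y2 = list(x), list(y)
    x2[0] -= 1; y2[0] += 1   # i = 1 is index 0
    if j is not None:
        x2[j] += 1; y2[j] -= 1
    candidates.append(tuple(x2) in S2 and tuple(y2) in S2)
# every candidate exchange leaves S2, so f^2 is not M-natural-convex
```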

It is rather surprising that nothing is known about scaling for integrally convex functions. Example 1.1 does not demonstrate the lack of scaling property of integrally convex functions, since \(f^{2}\) above is integrally convex, though not \(\mathrm{M}^{\natural }\)-convex.

As for proximity theorems, the following facts are known for separable convex, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions. In the following three theorems we assume that \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\), \(\alpha \) is a positive integer, and \(x^{\alpha } \in \mathrm{dom\,}f\). It is noteworthy that the proximity bound is independent of n for separable convex functions, and coincides with \(n (\alpha -1)\), which is linear in n, for \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions.

Theorem 1.2

Suppose that f is a separable convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ \varvec{1}_{i}, -\varvec{1}_{i} \ (1 \le i \le n) \} \), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le \alpha -1\).

Proof

The statement is obviously true if \(n=1\). The case of general n then follows easily from the fact that \(x^{*}\) is a minimizer of \(f(x) = \varphi _{1}(x_{1}) + \cdots + \varphi _{n}(x_{n})\) if and only if, for each i, \(x^{*}_{i}\) is a minimizer of \(\varphi _{i}\). \(\square \)

Theorem 1.3

([15]; [23, Theorem 7.18]) Suppose that f is an \(\mathrm{L}^{\natural }\)-convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ 0, 1 \}^{n} \cup \{ 0, -1 \}^{n}\), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le n (\alpha -1)\).

Theorem 1.4

([18]; [23, Theorem 6.37]) Suppose that f is an \(\mathrm{M}^{\natural }\)-convex function. If \(f(x^{\alpha }) \le f(x^{\alpha } + \alpha d)\) for all \(d \in \{ \varvec{1}_{i}, -\varvec{1}_{i} \ (1 \le i \le n), \ \varvec{1}_{i} - \varvec{1}_{j} \ (i \not = j) \} \), then there exists a minimizer \(x^{*}\) of f with \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le n (\alpha -1)\).

Based on the above results and their variants, efficient algorithms for minimizing \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions have been successfully designed with the proximity-scaling approach [18, 20, 21, 23, 30, 33]. Proximity theorems are also available for \(\mathrm{L}^{\natural }_{2}\)-convex and \(\mathrm{M}^{\natural }_{2}\)-convex functions [27] and L-convex functions on graphs [6, 7]. However, no proximity theorem has yet been proved for integrally convex functions.

The new findings of this paper are

  • A “box-barrier property” (Theorem 2.6), which allows us to restrict the search for a global minimum of an integrally convex function;

  • Stability of integral convexity under scaling when \(n = 2\) (Theorem 3.2), and an example to demonstrate its failure when \(n \ge 3\) (Example 3.1);

  • A proximity theorem with a superexponential bound \(\displaystyle [(n+1)!/ 2^{n-1}](\alpha -1)\) for all n (Theorem 5.1), and the impossibility of finding a proximity bound of the form \(B(n)(\alpha -1)\) where B(n) is linear or smaller than quadratic (Examples 4.4 and 4.5).

As a consequence of our proximity and scaling results, we derive that:

  • When n is fixed, an integrally convex function can be minimized in \(O(\log _{2} K_{\infty })\) time by standard proximity-scaling algorithms, where \(K_{\infty } = \max \{ \Vert x -y \Vert _{\infty } \mid x, y \in \mathrm{dom\,}f \}\) denotes the \(\ell _{\infty }\)-size of \(\mathrm{dom\,}f\).

This paper is organized as follows. In Sect. 2 the concept of integrally convex functions is reviewed with some new observations and, in Sect. 3, their scaling property is clarified. After a preliminary discussion in Sect. 4, a proximity theorem for integrally convex functions is established in Sect. 5. Algorithmic implications of the proximity-scaling results are discussed in Sect. 6 and concluding remarks are made in Sect. 7.

2 Integrally convex sets and functions

For \(x \in {{\mathbb {R}}}^{n}\) the integer neighborhood of x is defined as

$$\begin{aligned} N(x) = \left\{ z \in {\mathbb {Z}}^{n} \mid | x_{i} - z_{i} | < 1 \ (i=1,\ldots ,n) \right\} . \end{aligned}$$
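Concretely, N(x) is the Cartesian product over coordinates of \(\{ \lfloor x_{i} \rfloor , \lceil x_{i} \rceil \}\); a small illustrative helper:

```python
import math
from itertools import product

def integer_neighborhood(x):
    """N(x): all integer points z with |x_i - z_i| < 1, i.e. z_i ranges
    over {floor(x_i), ceil(x_i)} (a single value when x_i is integral)."""
    choices = [sorted({math.floor(t), math.ceil(t)}) for t in x]
    return [tuple(z) for z in product(*choices)]
```

For example, `integer_neighborhood((1.25, -0.5))` gives the four points (1, -1), (1, 0), (2, -1), (2, 0), while `integer_neighborhood((0.5, 2))` gives only (0, 2) and (1, 2).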

For a function \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) the local convex extension \({\tilde{f}}: {{\mathbb {R}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) of f is defined as the union of all convex envelopes of f on N(x) as follows:

$$\begin{aligned} {{\tilde{f}}}(x) = \min \left\{ \sum _{y \in N(x)} \lambda _{y} f(y) \mid \sum _{y \in N(x)} \lambda _{y} y = x, (\lambda _{y}) \in \Lambda (x) \right\} \quad (x \in {{\mathbb {R}}}^{n}),\qquad \quad \end{aligned}$$
(2.1)

where \(\Lambda (x)\) denotes the set of coefficients for convex combinations indexed by N(x):

$$\begin{aligned} \Lambda (x) = \left\{ (\lambda _{y} \mid y \in N(x) ) \mid \sum _{y \in N(x)} \lambda _{y} = 1, \lambda _{y} \ge 0 \ (\forall y \in N(x)) \right\} . \end{aligned}$$

If \({{\tilde{f}}}\) is convex on \({{\mathbb {R}}}^{n}\), then f is said to be integrally convex [2]. A set \(S \subseteq {{\mathbb {Z}}}^{n}\) is said to be integrally convex if the convex hull \({\overline{S}}\) of S coincides with the union of the convex hulls of \(S \cap N(x)\) over \(x \in {{\mathbb {R}}}^{n}\), i.e., if, for any \(x \in {{\mathbb {R}}}^{n}\), \(x \in {\overline{S}} \) implies \(x \in \overline{S \cap N(x)}\). A set \(S \subseteq {{\mathbb {Z}}}^{n}\) is integrally convex if and only if its indicator function is an integrally convex function. The effective domain and the set of minimizers of an integrally convex function are both integrally convex [23, Proposition 3.28]; in particular, the effective domain and the set of minimizers of an L\(^{\natural }\)- or M\(^{\natural }\)-convex function are integrally convex.

For \(n=2\), integrally convex sets are illustrated in Fig. 1 and their structure is described in the next proposition.

Fig. 1
figure 1

Concept of integrally convex sets. (a) Integrally convex, (b) not integrally convex, (c) not integrally convex

Proposition 2.1

A set \(S \subseteq {\mathbb {Z}}^{2}\) is an integrally convex set if and only if it can be represented as \(S = \{ (x_{1},x_{2}) \in {\mathbb {Z}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) for some \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m)\).

Proof

Consider the convex hull \({\overline{S}}\) of S, and denote the (shifted) unit square \(\{ (x_{1}, x_{2}) \in {{\mathbb {R}}}^{2} \mid a_{i} \le x_{i} \le a_{i}+1 \ (i=1,2) \}\) by \(I(a_{1}, a_{2})\), where \((a_{1}, a_{2}) \in {{\mathbb {Z}}}^{2}\). Let S be an integrally convex set. It follows from the definition that \({\overline{S}} \cap I(a_{1}, a_{2}) = \overline{S \cap I(a_{1}, a_{2})}\) for each \((a_{1}, a_{2}) \in {{\mathbb {Z}}}^{2}\). Obviously, \(\overline{S \cap I(a_{1}, a_{2})}\) can be described by (at most four) inequalities \(p'_{j} x_{1} + q'_{j} x_{2} \le r'_{j} \ (j=1,\ldots ,\ell ')\) with \(p'_{j}, q'_{j} \in \{ -1,0,+1 \}\) and \(r'_{j} \in {{\mathbb {Z}}}\)\((j=1,\ldots ,\ell ')\), where \(\ell ' = \ell '(a_{1}, a_{2}) \le 4\). Since \({\overline{S}}\) is the union of sets \({\overline{S}} \cap I(a_{1}, a_{2})\), \({\overline{S}}\) can be represented as \(\{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) by a subfamily of the inequalities used for all \(\overline{S \cap I(a_{1}, a_{2})}\). Then we have \(S = \{ (x_{1},x_{2}) \in {\mathbb {Z}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\). Conversely, integral convexity of set S represented in this form for any \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\) is an easy consequence of the simple shape of the (possibly unbounded) polygon \( \{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\), which has at most eight edges having directions parallel to one of the vectors (1, 0), (0, 1), (1, 1), \((1,-1)\). \(\square \)

We note that in the special case where all inequalities \(p_{i} x_{1} + q_{i} x_{2} \le r_{i}\)\((i=1,\ldots ,m)\) defining S in Proposition 2.1 satisfy the additional property \(p_{i} q_{i} \le 0\), the set S is actually an \(\mathrm{L}^{\natural }\)-convex set [23, Section 5.5], which is a special type of sublattice [28].

Remark 2.1

A subtle point in Proposition 2.1 is explained here. In Proposition 2.1 we do not mean that the system of inequalities for S describes the convex hull \({\overline{S}}\) of S. That is, it is not claimed that \({\overline{S}} = \{ (x_{1},x_{2}) \in {\mathbb {R}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) holds. For instance, \(S = \{ (0,0), (1,0) \}\) is an integrally convex set, which can be represented as the set of integer points satisfying the four inequalities: \( -x_{1} + x_{2} \le 0\), \( x_{1} - x_{2} \le 1\), \( x_{1} + x_{2} \le 1\), and \( -x_{1} - x_{2} \le 0\). These inequalities, however, do not describe the convex hull \({\overline{S}}\), which is the line segment connecting (0, 0) and (1, 0). Nevertheless, it is true in general (cf. the proof of Proposition 2.1) that the convex hull of an integrally convex set can be described by inequalities of the form of \(p'_{i} x_{1} + q'_{i} x_{2} \le r'_{i}\) with \(p'_{i}, q'_{i} \in \{ -1,0,+1 \}\) and \(r'_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m')\). For \(S = \{ (0,0), (1,0) \}\) we can describe \({\overline{S}}\) by adding two inequalities \(x_{2} \le 0\) and \(- \,x_{2} \le 0\) to the original system of four inequalities. The present form of Proposition 2.1, avoiding the convex hull, is convenient in the proof of Proposition 3.1.
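Both Proposition 2.1 and the subtlety of Remark 2.1 can be checked by direct enumeration. The sketch below (with an ad hoc bounding box) recovers \(S = \{ (0,0), (1,0) \}\) from the four inequalities, while the real point (1/2, 1/2) also satisfies them, confirming that they do not describe \({\overline{S}}\).

```python
from itertools import product

# The four inequalities of Remark 2.1 for S = {(0,0), (1,0)},
# stored as triples (p, q, r) meaning p*x1 + q*x2 <= r
ineqs = [(-1, 1, 0), (1, -1, 1), (1, 1, 1), (-1, -1, 0)]

def satisfies(pt, ineqs):
    return all(p * pt[0] + q * pt[1] <= r for p, q, r in ineqs)

# Integer solutions, enumerated over an ad hoc box containing S
S = {pt for pt in product(range(-3, 4), repeat=2) if satisfies(pt, ineqs)}
```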

Corollary 2.2

If a set \(S \subseteq {{\mathbb {Z}}}^2\) is integrally convex, then for all points \(x,y \in S\), the set

$$\begin{aligned} \mathrm{ICH}(x,y) = \left\{ z \in {{\mathbb {Z}}}^{2} \mid p z_{1} + q z_{2} \le \max \{ p x_{1} + q x_{2}, \ p y_{1} + q y_{2} \} \ \ (p, q \in \{ -1,0,+1 \}) \right\} \end{aligned}$$

is contained in S.

Proof

Let S be represented as in Proposition 2.1 and let \(x,y \in S\). Then we clearly have \( \max \{p_{i} x_{1} + q_{i} x_{2}, p_{i} y_{1} + q_{i} y_{2} \} \le r_{i}\)\((i=1,\ldots ,m)\). The claim follows by observing that \(\max \{p_{i} x_{1} + q_{i} x_{2}, p_{i} y_{1} + q_{i} y_{2} \}\) coincides with one of \(\max \{x_{i}, y_{i} \}\), \(\max \{-\,x_{i}, -\,y_{i} \}\)\((i=1,2)\), \(\max \{x_{1}-x_{2}, y_{1}- y_{2} \}\), \(\max \{x_{1}+x_{2}, y_{1}+ y_{2} \}\), \(\max \{-\,x_{1}+x_{2}, -\,y_{1}+ y_{2} \}\), \(\max \{-\,x_{1}-x_{2}, -\,y_{1}-y_{2} \}\), according to the values of \(p_{i}, q_{i} \in \{ -1,0,+1 \}\). \(\square \)

Note that \(\mathrm{ICH}(x,y)\) is integrally convex by Proposition 2.1, and that, by the above corollary, any integrally convex set containing \(\{x,y \}\) must contain \(\mathrm{ICH}(x,y)\). Thus \(\mathrm{ICH}(x,y)\) is the smallest integrally convex set containing \(\{x,y \}\).
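Following the proof of Corollary 2.2, \(\mathrm{ICH}(x,y)\) can be computed by imposing, for each \(p, q \in \{ -1,0,+1 \}\), the bound \(\max \{ p x_{1} + q x_{2}, p y_{1} + q y_{2} \}\); an illustrative brute-force sketch:

```python
from itertools import product

def ich(x, y):
    """Integer points z in the bounding box of {x, y} satisfying
    p*z1 + q*z2 <= max{p*x1 + q*x2, p*y1 + q*y2} for all p, q in
    {-1, 0, +1}: the smallest integrally convex set containing {x, y}."""
    xs = range(min(x[0], y[0]), max(x[0], y[0]) + 1)
    ys = range(min(x[1], y[1]), max(x[1], y[1]) + 1)
    dirs = [(p, q) for p in (-1, 0, 1) for q in (-1, 0, 1) if (p, q) != (0, 0)]
    return {z for z in product(xs, ys)
            if all(p * z[0] + q * z[1] <= max(p * x[0] + q * x[1],
                                              p * y[0] + q * y[1])
                   for p, q in dirs)}
```

For example, `ich((0, 0), (2, 1))` is the staircase {(0,0), (1,0), (1,1), (2,1)}, and `ich((0, 0), (1, 1))` consists of the two endpoints only.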

Integral convexity is preserved under the operations of origin shift, permutation of components, and componentwise (individual) sign inversion. For later reference we state these facts as a proposition.

Proposition 2.3

Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function.

  (1)

    For any \(z \in {{\mathbb {Z}}}^{n}\), \(f(z + x)\) is integrally convex in x.

  (2)

    For any permutation \(\sigma \) of \((1,2,\ldots ,n)\), \(f(x_{\sigma (1)}, x_{\sigma (2)}, \ldots , x_{\sigma (n)})\) is integrally convex in x.

  (3)

    For any \(s_{1}, s_{2}, \ldots , s_{n} \in \{ +1, -1 \}\), \(f(s_{1} x_{1}, s_{2} x_{2}, \ldots , s_{n} x_{n})\) is integrally convex in x.

Proof

The claims (1) to (3) follow easily from the definition of integrally convex functions and the obvious relations: \(N(z + x) = \{ z + y \mid y \in N(x) \}\), \( N((x_{\sigma (1)}, \ldots , x_{\sigma (n)})) = \{ (y_{\sigma (1)}, \ldots , y_{\sigma (n)}) \mid y \in N(x) \}\), and \(N((s_{1} x_{1}, \ldots , s_{n} x_{n})) = \{ (s_{1} y_{1}, \ldots , s_{n} y_{n}) \mid y \in N(x) \}\). \(\square \)

Integral convexity of a function can be characterized by a local condition under the assumption that the effective domain is an integrally convex set. The following theorem is proved in [2] when the effective domain is an integer interval (discrete rectangle). An alternative proof, which is also valid for the general case, is given in “Appendix A”.

Theorem 2.4

([2, Proposition 3.3]) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be a function with an integrally convex effective domain. Then the following properties are equivalent:

  (a)

    f is integrally convex.

  (b)

    For every \(x, y \in \mathrm{dom\,}f\) with \(\Vert x - y \Vert _{\infty } =2\) we have

    $$\begin{aligned} {\tilde{f}}\, \bigg (\frac{x + y}{2} \bigg ) \le \frac{1}{2} (f(x) + f(y)). \end{aligned}$$
    (2.2)

Theorem 2.5

([2, Proposition 3.1]; see also [23, Theorem 3.21]) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function and \(x^{*} \in \mathrm{dom\,}f\). Then \(x^{*}\) is a minimizer of f if and only if \(f(x^{*}) \le f(x^{*} + d)\) for all \(d \in \{ -1, 0, +1 \}^{n}\).
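Theorem 2.5 reduces global minimization to inspecting the \(3^{n}-1\) neighbors; a minimal local-search sketch based on it (illustrative; f is assumed to be a callable returning \(+\infty \) outside a bounded effective domain):

```python
from itertools import product

def is_global_min(f, x, n):
    """Theorem 2.5: for integrally convex f, comparing f(x) with its
    3^n - 1 neighbors f(x + d), d in {-1,0,+1}^n, certifies a minimum."""
    return all(f(tuple(x[i] + d[i] for i in range(n))) >= f(x)
               for d in product((-1, 0, 1), repeat=n))

def local_search_minimize(f, x, n):
    """Move to the best neighbor in the unit box until none improves;
    the stopping point is a global minimizer when f is integrally convex."""
    while True:
        nbrs = [tuple(x[i] + d[i] for i in range(n))
                for d in product((-1, 0, 1), repeat=n)]
        best = min(nbrs, key=f)
        if f(best) >= f(x):
            return x
        x = best
```

For the separable convex (hence integrally convex) function \(f(x) = |x_{1}-2| + |x_{2}+1|\), local search from (0, 0) terminates at the global minimizer (2, -1).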

The local characterization of global minima stated in Theorem 2.5 above can be generalized to the following form; see Fig. 2.

Fig. 2
figure 2

Box-barrier property (\(\circ \in S\), \(\bullet \in W\))

Theorem 2.6

(Box-barrier property) Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, and let \(p \in ({\mathbb {Z}} \cup \{ -\,\infty \})^{n}\) and \(q \in ({\mathbb {Z}} \cup \{ +\,\infty \})^{n}\), where \(p \le q\). Define

$$\begin{aligned} S&= \{ x \in {\mathbb {Z}}^{n} \mid p_{i}< x_{i} < q_{i} \quad (i=1,\ldots ,n) \}, \\ W_{i}^{+}&= \{ x \in {\mathbb {Z}}^{n} \mid x_{i} = q_{i}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \} \quad (i=1,\ldots ,n), \\ W_{i}^{-}&= \{ x \in {\mathbb {Z}}^{n} \mid x_{i} = p_{i}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \} \quad (i=1,\ldots ,n), \end{aligned}$$

and \(W = \bigcup _{i=1}^{n} (W_{i}^{+} \cup W_{i}^{-})\). Let \({{\hat{x}}} \in S \cap \mathrm{dom\,}f\). If \(f({{\hat{x}}}) \le f(y)\) for all \(y \in W\), then \(f({{\hat{x}}}) \le f(z)\) for all \(z \in {{\mathbb {Z}}}^{n} {\setminus } S\).

Proof

Let \(U = \bigcup _{i=1}^{n} \{ x \in {\mathbb {R}}^{n} \mid x_{i} \in \{ p_{i}, q_{i} \}, \ p_{j} \le x_{j} \le q_{j} \ (j \not = i) \}\), for which we have \(U \cap {\mathbb {Z}}^{n} = W\). For a point \(z \in {{\mathbb {Z}}}^{n} {\setminus } S\), the line segment connecting \({{\hat{x}}}\) and z intersects U at a point, say, \(u \in {{\mathbb {R}}}^{n}\). Then its integer neighborhood N(u) is contained in W. Since the local convex extension \({\tilde{f}}(u)\) is a convex combination of the f(y)’s with \(y \in N(u)\), and \(f(y) \ge f({{\hat{x}}})\) for every \(y \in W\), we have \({\tilde{f}}(u) \ge f({{\hat{x}}})\). On the other hand, it follows from integral convexity that \({\tilde{f}}(u) \le (1 - \lambda ) f({{\hat{x}}}) + \lambda f(z)\) for some \(\lambda \) with \(0 < \lambda \le 1\). Hence \(f({{\hat{x}}}) \le {\tilde{f}}(u) \le (1 - \lambda ) f({{\hat{x}}}) + \lambda f(z)\), and therefore, \(f({{\hat{x}}}) \le f(z)\). \(\square \)

Theorem 2.5 is a special case of Theorem 2.6 with \(p={{\hat{x}}} - \varvec{1}\) and \(q={{\hat{x}}} + \varvec{1}\). Another special case of Theorem 2.6 with \(p_{j} = -\,\infty \)\((j=1,\ldots ,n)\) and \(q_{j} = +\,\infty \)\((j \not = i)\) for a particular i takes the following form, which we use in Sect. 5.3.

Corollary 2.7

(Hyperplane-barrier property). Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function. Let \({{\hat{x}}} \in \mathrm{dom\,}f\), \(q \in {{\mathbb {Z}}}\), and let i be an integer with \(1 \le i \le n\). If \({{\hat{x}}}_{i} < q\) and \(f({{\hat{x}}}) \le f(y)\) for all \(y \in {{\mathbb {Z}}}^{n}\) with \(y_{i} = q\), then \(f({{\hat{x}}}) \le f(z)\) for all \(z \in {{\mathbb {Z}}}^{n}\) with \(z_{i} \ge q\).

We denote the sets of nonnegative integers and positive integers by \({{\mathbb {Z}}}_{+}\) and \({{\mathbb {Z}}}_{++}\), respectively. For \(\alpha \in {{\mathbb {Z}}}\) we write \(\alpha {{\mathbb {Z}}}\) for \(\{ \alpha x \mid x \in {{\mathbb {Z}}}\}\). For vectors \(a, b \in {{\mathbb {R}}}^{n}\) with \(a \le b\), \([a,b]_{{{\mathbb {R}}}}\) denotes the interval between a and b, i.e., \([a,b]_{{{\mathbb {R}}}} = \{ x \in {{\mathbb {R}}}^{n} \mid a \le x \le b \}\), and \([a,b]_{{{\mathbb {Z}}}}\) the integer interval between a and b, i.e., \([a,b]_{{{\mathbb {Z}}}} = \{ x \in {{\mathbb {Z}}}^{n} \mid a \le x \le b \}\).

3 The scaling operation for integrally convex functions

In this section we consider the scaling operation for integrally convex functions. Recall that, for \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) and \(\alpha \in {\mathbb {Z}}_{++}\), the \(\alpha \)-scaling of f is defined to be the function \(f^{\alpha }: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) given by \(f^{\alpha }(x) = f(\alpha x) \)\((x \in {\mathbb {Z}}^{n})\).

When \(n = 2\), integral convexity is preserved under scaling. We first deal with integrally convex sets.

Proposition 3.1

Let \(S \subseteq {\mathbb {Z}}^{2}\) be an integrally convex set and \(\alpha \in {\mathbb {Z}}_{++}\). Then \(S^{\alpha } = \{ x \in {\mathbb {Z}}^{2} \mid \alpha x \in S \}\) is an integrally convex set.

Proof

By Proposition 2.1 we can assume that S is represented as \(S = \{ (x_1,x_2) \in {{\mathbb {Z}}}^{2} \mid p_{i} x_{1} + q_{i} x_{2} \le r_{i} \ (i=1,\ldots ,m) \}\) for some \(p_{i}, q_{i} \in \{ -1,0,+1 \}\) and \(r_{i} \in {{\mathbb {Z}}}\)\((i=1,\ldots ,m)\). Since \((y_1, y_2) \in S^{\alpha }\) if and only if \((\alpha y_1, \alpha y_2) \in S\), we have

$$\begin{aligned} S^{\alpha }&= \{ (y_1,y_2) \in {{\mathbb {Z}}}^{2} \mid \alpha (p_{i} y_{1} + q_{i} y_{2}) \le r_{i} \ (i=1,\ldots ,m) \} \\&= \{ (y_1,y_2) \in {{\mathbb {Z}}}^{2} \mid p_{i} y_{1} + q_{i} y_{2} \le r'_{i} \ (i=1,\ldots ,m) \}, \end{aligned}$$

where \(r'_{i} = \lfloor r_{i} / \alpha \rfloor \)\((i=1,\ldots ,m)\). By Proposition 2.1 this implies integral convexity of \(S^{\alpha }\). \(\square \)
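The proof of Proposition 3.1 is constructive: each right-hand side \(r_{i}\) is replaced by \(\lfloor r_{i}/\alpha \rfloor \). An illustrative sketch, together with a brute-force check that the scaled inequalities indeed describe \(S^{\alpha }\):

```python
def scale_ineqs(ineqs, alpha):
    """Replace each right-hand side r by floor(r / alpha); valid because
    the coefficients p, q lie in {-1, 0, +1}."""
    return [(p, q, r // alpha) for p, q, r in ineqs]

def members(ineqs, lo=-4, hi=5):
    """Integer points of a 2D inequality system over an ad hoc box."""
    return {(a, b) for a in range(lo, hi) for b in range(lo, hi)
            if all(p * a + q * b <= r for p, q, r in ineqs)}

# Example: S = {x : x1 + x2 <= 3, x1 >= 0, x2 >= 0} and alpha = 2
ineqs = [(1, 1, 3), (-1, 0, 0), (0, -1, 0)]
S_alpha = members(scale_ineqs(ineqs, 2))
# direct definition: y is in S^alpha iff alpha * y is in S
direct = {(a, b) for a in range(-4, 5) for b in range(-4, 5)
          if all(p * 2 * a + q * 2 * b <= r for p, q, r in ineqs)}
```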

Next we turn to integrally convex functions.

Theorem 3.2

Let \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function and \(\alpha \in {\mathbb {Z}}_{++}\). Then the scaled function \(f^{\alpha }\) is integrally convex.

Proof

The effective domain \(\mathrm{dom\,}f^{\alpha } = ( \mathrm{dom\,}f \cap (\alpha {{\mathbb {Z}}})^{2} )/\alpha \) is an integrally convex set by Proposition 3.1. By Theorem 2.4 and Proposition 2.3, we only have to check condition (2.2) for \(f^{\alpha }\) with \(x =(0,0)\) and \(y = (2,0)\), (2, 2), (2, 1). That is, it suffices to show

$$\begin{aligned}&f(0,0) + f(2 \alpha , 0) \ge 2 f(\alpha ,0), \end{aligned}$$
(3.1)
$$\begin{aligned}&f(0,0) + f(2 \alpha , 2 \alpha ) \ge 2 f(\alpha ,\alpha ), \end{aligned}$$
(3.2)
$$\begin{aligned}&f(0,0) + f(2 \alpha , \alpha ) \ge f(\alpha ,\alpha ) + f(\alpha ,0). \end{aligned}$$
(3.3)

The first two inequalities, (3.1) and (3.2), follow easily from integral convexity of f, whereas (3.3) is a special case of the basic parallelogram inequality (3.4) below with \(a = b = \alpha \). \(\square \)

Proposition 3.3

(Basic parallelogram inequality). For an integrally convex function \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) we have

$$\begin{aligned} f(0,0) + f(a+b,a) \ge f(a,a) + f(b,0) \qquad (a,b \in {{\mathbb {Z}}}_{+}). \end{aligned}$$
(3.4)

Proof

We may assume \(a, b \ge 1\) and \(\{ (0,0), (a+b,a) \} \subseteq \mathrm{dom\,}f\), since otherwise the inequality (3.4) is trivially true. Since \(\mathrm{dom\,}f\) is integrally convex, Corollary 2.2 implies that \(k(1,1) +l(1,0) \in \mathrm{dom\,}f\) for all \((k, l)\) with \(0 \le k \le a\) and \(0 \le l \le b\). We use the notation \(f_{x}(z)=f(x+z)\). For each \(x \in \mathrm{dom\,}f\) we have

$$\begin{aligned} f_{x}( 0,0 ) + f_{x}( 2,1 ) \ge f_{x}( 1,1 ) + f_{x}( 1,0 ) \end{aligned}$$

by integral convexity of f. By adding these inequalities for \(x=k(1,1) +l(1,0)\) with \(0 \le k \le a-1\) and \(0 \le l \le b-1\), we obtain (3.4). Note that all the terms involved in these inequalities are finite, since \(k(1,1) +l(1,0) \in \mathrm{dom\,}f\) for all k and l. \(\square \)

If \(n \ge 3\), \(f^{\alpha }\) is not always integrally convex. This is demonstrated by the following example.

Example 3.1

Consider the integrally convex function \(f: {\mathbb {Z}}^{3} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) defined on \(\mathrm{dom\,}f = [(0,0,0), (4,2,2)]_{{{\mathbb {Z}}}}\) by

For the scaling with \(\alpha = 2\), we have a failure of integral convexity. Indeed, for \(x = (0,0,0)\) and \(y = (2, 1, 1)\) we have

$$\begin{aligned} \widetilde{f^{\alpha }} \, \bigg (\frac{x + y}{2} \bigg )&= \min \left\{ \frac{1}{2}f^{\alpha }(1,1,1) + \frac{1}{2}f^{\alpha }(1,0,0), \ \frac{1}{2}f^{\alpha }(1,1,0) + \frac{1}{2}f^{\alpha }(1,0,1) \right\} \\&= \frac{1}{2} \min \left\{ f(2,2,2) + f(2,0,0), \ f(2,2,0) + f(2,0,2) \right\} \\&= \frac{1}{2} \min \left\{ 1+ 0, \ 1 + 0 \right\} = \frac{1}{2} \\&> 0 = \frac{1}{2} ( f(0,0,0) + f(4,2,2) ) = \frac{1}{2} (f^{\alpha }(x) + f^{\alpha }(y)), \end{aligned}$$

which shows the failure of (2.2) in Theorem 2.4. The set \(S = \arg \min f = \{ x \mid f(x)=0 \}\) is an integrally convex set, and \(S^{\alpha } = \{ x \mid \alpha x \in S \} = \{ (0,0,0), (1,0,0), (1,0,1), (2,1,1) \}\) is not an integrally convex set.

In view of the fact that the class of \(\mathrm{L}^{\natural }\)-convex functions is stable under scaling, while this is not true for the superclass of integrally convex functions, we are naturally led to the question of finding an intermediate class of functions that is stable under scaling. See Sect. 7 for this issue.

4 Preliminary discussion on proximity theorems

Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) and \(\alpha \in {{\mathbb {Z}}}_{++}\). We say that \(x^{\alpha } \in \mathrm{dom\,}f\) is an \(\alpha \)-local minimizer of f (or \(\alpha \)-local minimal for f) if \(f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d)\) for all \(d \in \{ -1,0, +1 \}^{n}\). In general terms a proximity theorem states that for \(\alpha \in {{\mathbb {Z}}}_{++}\) there exists an integer \(B(n,\alpha ) \in {{\mathbb {Z}}}_{+}\) such that if \(x^{\alpha }\) is an \(\alpha \)-local minimizer of f, then there exists a minimizer \(x^{*}\) of f satisfying \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le B(n,\alpha )\), where \(B(n,\alpha )\) is called the proximity distance.

Before presenting a proximity theorem for integrally convex functions in Sect. 5, we establish in this section lower bounds for the proximity distance. We also present a proximity theorem for \(n=2\), as the proof is fairly simple in this particular case, though the proof method does not extend to general \(n \ge 3\).

4.1 Lower bounds for the proximity distance

The following examples provide us with lower bounds for the proximity distance. The first three demonstrate the tightness of the bounds for separable convex functions, \(\mathrm{L}^{\natural }\)-convex and \(\mathrm{M}^{\natural }\)-convex functions given in Theorems 1.2, 1.3 and 1.4, respectively.

Example 4.1

(Separable convex function) Let \(\varphi (t) = \max (-t, (\alpha -1)(t - \alpha ) )\) for \(t \in {{\mathbb {Z}}}\) and define \(f(x) = \varphi (x_{1}) + \cdots + \varphi (x_{n})\), which is separable convex. This function has a unique minimizer at \(x^{*}=(\alpha -1, \ldots , \alpha -1)\), whereas \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal and \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = \alpha - 1\). This shows the tightness of the bound \(\alpha - 1\) given in Theorem 1.2.
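For concrete values of n and \(\alpha \), Example 4.1 can be verified by enumeration (illustrative sketch with \(n=3\), \(\alpha =4\)):

```python
from itertools import product

n, alpha = 3, 4

def phi(t):
    # max(-t, (alpha-1)(t-alpha)) has its unique minimum at t = alpha - 1
    return max(-t, (alpha - 1) * (t - alpha))

def f(x):
    return sum(phi(t) for t in x)

# x^alpha = 0 is alpha-local minimal: no step alpha*d improves f
alpha_local = all(f(tuple(alpha * d[i] for i in range(n))) >= f((0,) * n)
                  for d in product((-1, 0, 1), repeat=n))

# brute-force the unique minimizer over a box that clearly contains it
best = min(product(range(-alpha, 2 * alpha + 1), repeat=n), key=f)
```

Here `best` comes out as \((\alpha -1, \ldots , \alpha -1) = (3,3,3)\), at \(\ell _{\infty }\)-distance \(\alpha - 1\) from the origin.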

Example 4.2

(\(\mathrm{L}^{\natural }\)-convex function) Consider \(X \subseteq {{\mathbb {Z}}}^{n}\) defined by

$$\begin{aligned} X&= \left\{ x \in {{\mathbb {Z}}}^{n} \mid 0 \le x_{i} - x_{i+1} \le \alpha -1 \ (i=1,\ldots ,n-1), \ 0 \le x_{n} \le \alpha -1 \right\} \\&= \left\{ x \in {{\mathbb {Z}}}^{n} \mid x = \sum _{i=1}^{n} \mu _{i} \varvec{1}_{ \left\{ 1,2,\ldots , i \right\} }, \ \ 0 \le \mu _{i} \le \alpha -1 \ (i=1,\ldots ,n) \right\} , \end{aligned}$$

where \( \varvec{1}_{ \{ 1,2,\ldots , i \} } = ( \overbrace{1,1,\ldots , 1}^{i}, 0,0,\ldots , 0)\). The function f defined by \(f(x)=-x_{1}\) on \(\mathrm{dom\,}f =X\) is an \(\mathrm{L}^{\natural }\)-convex function and has a unique minimizer at \(x^{*} = (n (\alpha -1),(n-1)(\alpha -1),\ldots ,2(\alpha -1),\alpha -1)\). On the other hand, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(X \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\). We have \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = n (\alpha - 1)\), which shows the tightness of the bound \(n (\alpha - 1)\) given in Theorem 1.3. This example is a reformulation of [26, Remark 2.3] for L-convex functions to \(\mathrm{L}^{\natural }\)-convex functions.
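The parameterization of X by \(\mu \) makes this example easy to check numerically. The following sketch is ours (with \(\alpha =2\), \(n=3\)); it enumerates X, verifies \(X \cap \{ -\alpha , 0, \alpha \}^{n} = \{ \varvec{0} \}\), and locates the minimizer of \(f(x)=-x_{1}\).

```python
import itertools

alpha, n = 2, 3
X = set()
for mu in itertools.product(range(alpha), repeat=n):  # mu_i in [0, alpha-1]
    # x = sum_i mu_i * 1_{1..i}, i.e. x_j = mu_j + mu_{j+1} + ... + mu_n
    X.add(tuple(sum(mu[j:]) for j in range(n)))

# x^alpha = 0 is alpha-local minimal: the only point of X with all
# coordinates in {-alpha, 0, alpha} is the origin.
steps = set(itertools.product((-alpha, 0, alpha), repeat=n))
assert X & steps == {(0,) * n}

# f(x) = -x_1 on X is uniquely minimized at (n(a-1), (n-1)(a-1), ..., a-1).
x_star = max(X, key=lambda x: x[0])
print(x_star)  # (3, 2, 1) for alpha = 2, n = 3
```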

Example 4.3

(\(\mathrm{M}^{\natural }\)-convex function) Consider \(X \subseteq {{\mathbb {Z}}}^{n}\) defined by

$$\begin{aligned} X = \{ x \in {{\mathbb {Z}}}^{n} \mid 0&\le x_{1} + x_{2} + \cdots + x_{n} \le \alpha -1, -(\alpha -1) \le x_{i} \le 0 \ (i=2,\ldots ,n) \} \\ = \{ x \in {{\mathbb {Z}}}^{n} \mid x&= (\mu _{1} + \mu _{2} + \cdots + \mu _{n}, -\mu _{2},-\mu _{3}, \ldots , -\mu _{n}), \\ 0&\le \mu _{i} \le \alpha -1 \ (i=1,\ldots ,n) \}. \end{aligned}$$

The function f defined by \(f(x)=-x_{1}\) on \(\mathrm{dom\,}f =X\) is an \(\mathrm{M}^{\natural }\)-convex function and has a unique minimizer at \(x^{*} = (n (\alpha -1),-\,(\alpha -1), -\,(\alpha -1),\ldots ,-\,(\alpha -1))\). On the other hand, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(X \cap \{ -\,\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\). We have \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } = n (\alpha - 1)\), which shows the tightness of the bound \(n (\alpha - 1)\) given in Theorem 1.4. This example is a reformulation of [26, Remark 2.8] for M-convex functions to \(\mathrm{M}^{\natural }\)-convex functions.

For integrally convex functions with \(n \ge 3\), the bound \(n (\alpha -1)\) is no longer valid. This is demonstrated by the following examples.

Example 4.4

Consider an integrally convex function \(f: {\mathbb {Z}}^{3} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) defined on \(\mathrm{dom\,}f = [(0,0,0), (4,2,2)]_{{{\mathbb {Z}}}}\) by

and let \(\alpha = 2\). For \(x^{\alpha }=(0,0,0)\) we have \(f(x^{\alpha })=0\) and \(f(x^{\alpha }) \le f(x^{\alpha }+ 2 d)\) for \(d=(1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1), (1,1,1)\). Hence \(x^{\alpha }=(0,0,0)\) is \(\alpha \)-local minimal. The unique (global) minimizer of f is located at \(x^{*}=(4,2,2)\) with \(f(x^{*})=-\,4\) and \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } = 4\). The \(\ell _{\infty }\)-distance between \(x^{\alpha }\) and \(x^{*}\) is strictly larger than \(n (\alpha -1) = 3\). We remark that the scaled function \(f^{\alpha }\) is not integrally convex.

The following example demonstrates a quadratic lower bound in n for the proximity distance for integrally convex functions.

Fig. 3: Example for the \(O(n^{2})\) lower bound for the proximity distance (\(m=3\))

Example 4.5

For a positive integer \(m\), we consider two bipartite graphs \(G_{1}\) and \(G_{2}\) on vertex bipartition \((\{ 0^{+}, 1^{+},\ldots ,m^{+} \}, \{ 0^{-}, 1^{-},\ldots ,m^{-} \})\); see Fig. 3. The edge sets of \(G_{1}\) and \(G_{2}\) are defined respectively as \(E_{1} = \{ (0^{+}, 0^{-}) \} \cup \{ (i^{+}, j^{-}) \mid i,j=1,\ldots ,m \}\) and \(E_{2} = \{ (0^{+}, j^{-}) \mid j=1,\ldots ,m \} \cup \{ (i^{+}, 0^{-}) \mid i=1,\ldots ,m \}\). Let \(V^{+} = \{ 1^{+},\ldots ,m^{+} \}\), \(V^{-} = \{ 1^{-},\ldots ,m^{-} \}\), and \(n = 2m + 2\). Consider \(X_{1}, X_{2} \subseteq {{\mathbb {Z}}}^{n}\) defined by

$$\begin{aligned} X_{1}&= \left\{ \sum _{i=1}^{m}\sum _{j=1}^{m} \lambda _{ij}(\varvec{1}_{i^{+}}{-}\varvec{1}_{j^{-}}) + \lambda _{0} (\varvec{1}_{0^{+}}{-}\varvec{1}_{0^{-}})\; \left| \begin{array}{l} \lambda _{ij} \in [0,\alpha -1]_{{{\mathbb {Z}}}} \; (i,j=1,\ldots ,m) \\ \lambda _{0} \in [0,m^{2}(\alpha -1)]_{{{\mathbb {Z}}}} \end{array}\right. \right\} , \\ X_{2}&= \left\{ \sum _{i=1}^{m}\mu _{i}(\varvec{1}_{i^{+}}{-}\varvec{1}_{0^{-}}) + \sum _{j=1}^{m}\nu _{j}(\varvec{1}_{0^{+}}{-}\varvec{1}_{j^{-}})\; \left| \begin{array}{l} \mu _{i} \in [0,m(\alpha -1)]_{{{\mathbb {Z}}}} \; (i=1,\ldots ,m) \\ \nu _{j} \in [0,m(\alpha -1)]_{{{\mathbb {Z}}}} \; (j=1,\ldots ,m) \end{array} \right. \right\} , \end{aligned}$$

where \(X_{1}\) and \(X_{2}\) represent the sets of boundaries of flows in \(G_{1}\) and \(G_{2}\), respectively. We define functions \(f_{1},f_{2} : {{\mathbb {Z}}}^{n} \rightarrow {{\mathbb {R}}}\cup \{ +\,\infty \}\) with \(\mathrm{dom\,}f_{1} = X_{1}\) and \(\mathrm{dom\,}f_{2} = X_{2}\) by

$$\begin{aligned} f_{1}(x) = \left\{ \begin{array}{ll} x(V^{-}) &{} (x \in X_{1}), \\ +\,\infty &{} (x \not \in X_{1}), \end{array}\right. \quad f_{2}(x) = \left\{ \begin{array}{ll} x(V^{-}) &{} (x \in X_{2}), \\ +\,\infty &{} (x \not \in X_{2}) \end{array}\right. \qquad (x \in {{\mathbb {Z}}}^{n}), \end{aligned}$$

where \(x(U) = \sum _{u \in U} x_{u}\) for any set U of vertices. Both \(f_{1}\) and \(f_{2}\) are M-convex, and hence \(f = f_{1} + f_{2}\) is an \(\mathrm{M}_{2}\)-convex function, which is integrally convex (see [23, Section 8.3.1]). We have \(\mathrm{dom\,}f = \mathrm{dom\,}f_{1} \cap \mathrm{dom\,}f_{2} = X_{1} \cap X_{2}\) and f is linear on \(\mathrm{dom\,}f\). As is easily verified, f has a unique minimizer at \(x^{*}\) defined by

$$\begin{aligned} x^{*}_{u} = \left\{ \begin{array}{ll} m(\alpha -1) &{} (u \in V^{+}), \\ -\,m(\alpha -1) &{} (u \in V^{-}), \\ m^{2}(\alpha -1) &{} (u = 0^{+}), \\ -\,m^{2}(\alpha -1) &{} (u = 0^{-}), \end{array}\right. \end{aligned}$$

which corresponds to \(\lambda _{0} = m^{2}(\alpha -1)\), \(\lambda _{ij} = \alpha -1\), \(\mu _{i} = \nu _{j} = m(\alpha -1)\) \((i,j=1,\ldots ,m)\). We mention that the function f here is constructed in [26, Remark 2.19] for a slightly different purpose (namely, for an \(\mathrm{M}_{2}\)-proximity theorem).

Let \(x^{\alpha } = \mathbf{0}\). Obviously, \(\mathbf{0}\in \mathrm{dom\,}f\). Moreover, \(x^{\alpha } = \mathbf{0}\) is \(\alpha \)-local minimal, since \(\mathrm{dom\,}f \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\), as shown below. Since \(\Vert x^{*}-x^{\alpha }\Vert _{\infty } = m^{2}(\alpha -1) = (n-2)^{2}(\alpha -1)/4\), we obtain a quadratic lower bound \((n-2)^{2}(\alpha -1)/4\) for the proximity distance for integrally convex functions.

The proof of \(\mathrm{dom\,}f \cap \{ -\alpha , 0,\alpha \}^{n} = \{ \varvec{0} \}\) goes as follows. Let \(x \in X_{1} \cap X_{2} \cap \{ -\alpha , 0,\alpha \}^{n}\). We have \(x_{0^{+}} \in \{ 0, \alpha \}\) and \(x_{0^{-}} \in \{ 0, -\alpha \}\). We consider four cases to conclude that \(x = \varvec{0}\).

  (i)

    Case of \(x_{0^{+}} = x_{0^{-}} = 0\): The structure of \(X_{2}\) forces \(x = \varvec{0}\).

  (ii)

    Case of \(x_{0^{+}} = \alpha \), \(x_{0^{-}} = 0\): The structure of \(X_{2}\) forces \(x_{i^{+}} = 0\) for \(i=1, \ldots , m\) and

    $$\begin{aligned} x_{j^{-}} = \left\{ \begin{array}{ll} -\,\alpha &{} (j = j_{0}), \\ 0 &{} (j \not = j_{0}) \\ \end{array}\right. \end{aligned}$$

    for some \(j_{0}\) \((1 \le j_{0} \le m)\), but this is impossible by the structure of \(X_{1}\).

  (iii)

    Case of \(x_{0^{+}} = 0\), \(x_{0^{-}} = -\,\alpha \): The proof is similar to that of (ii) above.

  (iv)

    Case of \(x_{0^{+}} = \alpha \), \(x_{0^{-}} = -\,\alpha \): The structure of \(X_{2}\) forces

    $$\begin{aligned} x_{i^{+}} = \left\{ \begin{array}{ll} \alpha &{} (i = i_{0}), \\ 0 &{} (i \not = i_{0}), \\ \end{array}\right. \quad x_{j^{-}} = \left\{ \begin{array}{ll} -\,\alpha &{} (j = j_{0}), \\ 0 &{} (j \not = j_{0}) \\ \end{array}\right. \end{aligned}$$

    for some \(i_{0}\) \((1 \le i_{0} \le m)\) and \(j_{0}\) \((1 \le j_{0} \le m)\), but this is impossible by the structure of \(X_{1}\).
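The case analysis above can be confirmed by exhaustive enumeration for small parameters. The following sketch is our own check (with \(m=2\), \(\alpha =2\), so \(n=6\)); it builds \(X_{1}\) and \(X_{2}\) from their flow parameterizations and verifies both the \(\alpha \)-local minimality of \(\varvec{0}\) and the distance \(m^{2}(\alpha -1)\).

```python
import itertools

m, alpha = 2, 2
n = 2 * m + 2  # coordinates ordered (0+, 1+, ..., m+, 0-, 1-, ..., m-)
P = lambda i: i           # position of vertex i+
M = lambda j: m + 1 + j   # position of vertex j-

X1 = set()
pairs = list(itertools.product(range(1, m + 1), repeat=2))
for lam in itertools.product(range(alpha), repeat=m * m):   # lam_ij in [0, alpha-1]
    base = [0] * n
    for (i, j), l in zip(pairs, lam):
        base[P(i)] += l
        base[M(j)] -= l
    for lam0 in range(m * m * (alpha - 1) + 1):             # lam_0 in [0, m^2(alpha-1)]
        x = base[:]
        x[P(0)] += lam0
        x[M(0)] -= lam0
        X1.add(tuple(x))

X2 = set()
for munu in itertools.product(range(m * (alpha - 1) + 1), repeat=2 * m):
    x = [0] * n
    for i in range(1, m + 1):        # mu_i units on arc (i+, 0-)
        x[P(i)] += munu[i - 1]
        x[M(0)] -= munu[i - 1]
    for j in range(1, m + 1):        # nu_j units on arc (0+, j-)
        x[P(0)] += munu[m + j - 1]
        x[M(j)] -= munu[m + j - 1]
    X2.add(tuple(x))

dom = X1 & X2
assert dom & set(itertools.product((-alpha, 0, alpha), repeat=n)) == {(0,) * n}
x_star = (4, 2, 2, -4, -2, -2)   # = (m^2, m, m, -m^2, -m, -m) * (alpha-1)
assert x_star in dom
print(max(abs(t) for t in x_star))  # 4 = (n-2)^2 (alpha-1) / 4
```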

We have seen that the proximity theorem with the linear bound \(n (\alpha -1)\) does not hold for all integrally convex functions. Then a natural question arises: can we establish a proximity theorem at all by enlarging the proximity bound? This question is answered in the affirmative in Sect. 5.

4.2 A proximity theorem for integrally convex functions with \(n=2\)

In the case of \(n=2\) the proximity bound \(n (\alpha - 1) = 2 (\alpha - 1)\) is valid for integrally convex functions.

Theorem 4.1

Let \(f: {\mathbb {Z}}^{2} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, \(\alpha \in {\mathbb {Z}}_{++}\), and \(x^{\alpha } \in \mathrm{dom\,}f\). If \(f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d)\) for all \(d \in \{ -1,0, +1 \}^{2}\), then there exists a minimizer \(x^{*} \in {{\mathbb {Z}}}^{2}\) of f with \( \Vert x^{\alpha } - x^{*} \Vert _{\infty } \le 2 (\alpha - 1)\).

Proof

We may assume \(\alpha \ge 2\) and \(x^{\alpha } = \varvec{0}\) by Proposition 2.3 (1). Define

$$\begin{aligned} C&= \{ (x_{1},x_{2}) \in {{\mathbb {Z}}}^{2} \mid 0 \le x_{2} \le x_{1} \}, \\ S&= \{ (x_{1},x_{2}) \in {{\mathbb {Z}}}^{2} \mid 0 \le x_{2} \le x_{1} \le 2 (\alpha -1) \}. \end{aligned}$$

Let \(\mu \) be the minimum of \(f(x_{1},x_{2})\) over \((x_{1},x_{2}) \in S\) and let \(({{\hat{x}}}_{1},{{\hat{x}}}_{2})\) be a point in S with \(f({{\hat{x}}}_{1},{{\hat{x}}}_{2})=\mu \). Then

$$\begin{aligned} f(x_{1},x_{2}) \ge \mu \qquad ((x_{1},x_{2}) \in S). \end{aligned}$$
(4.1)

We will show that

$$\begin{aligned} f(2 \alpha - 1,k) \ge \mu \qquad (0 \le k \le 2 \alpha -1). \end{aligned}$$
(4.2)

Then, by Corollary 2.7 (hyperplane-barrier property), it follows that \(f(z_{1},z_{2}) \ge \mu \) for all \((z_{1},z_{2}) \in C\), that is, there is no \((z_{1},z_{2}) \in C {\setminus } S\) with \(f(z_{1},z_{2}) < \mu \). This proves the claim of the theorem, since \({{\mathbb {Z}}}^{2}\) can be covered by eight sectors similar to C and Proposition 2.3 holds.

The basic parallelogram inequality (3.4) with \(a = k\) and \(b =2 \alpha -1-k\) yields

$$\begin{aligned} f(0,0) + f(2 \alpha - 1,k) \ge f(k,k) + f(2 \alpha -1-k,0). \end{aligned}$$
(4.3)

Case 1 \(0 \le k \le \alpha -1\). Since \(2 \alpha -1-k \ge \alpha \), by convexity of f(t, 0) in t, we have

$$\begin{aligned} \frac{1}{2 \alpha -1-k}[f(2 \alpha -1-k,0) - f(0,0)] \ge \frac{1}{\alpha }[f(\alpha ,0) - f(0,0)] \ge 0. \end{aligned}$$

On the other hand, \(f(k,k) \ge \mu \) by (4.1). Then it follows from (4.3) that

$$\begin{aligned} f(2 \alpha - 1,k) \ge f(k,k) +[ f(2 \alpha -1-k,0) - f(0,0) ] \ge \mu . \end{aligned}$$

Case 2 \(\alpha \le k \le 2 \alpha -1\). Since \(k \ge \alpha \), by convexity of f(tt) in t, we have

$$\begin{aligned} \frac{1}{k}[f(k,k) - f(0,0)] \ge \frac{1}{\alpha }[f(\alpha ,\alpha ) - f(0,0)] \ge 0. \end{aligned}$$

On the other hand, \(f(2 \alpha -1-k,0) \ge \mu \) by (4.1). Then it follows from (4.3) that

$$\begin{aligned} f(2 \alpha - 1,k) \ge f(2 \alpha -1-k,0) + [ f(k,k) - f(0,0) ] \ge \mu . \end{aligned}$$

We have thus shown (4.2), completing the proof of Theorem 4.1. \(\square \)

5 A proximity theorem for integrally convex functions

In this section we establish a proximity theorem for integrally convex functions in an arbitrary number of variables.

5.1 Main result

Theorem 5.1

Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function, \(\alpha \in {\mathbb {Z}}_{++}\), and \(x^{\alpha } \in \mathrm{dom\,}f\).

  (1)

    If

    $$\begin{aligned} f(x^{\alpha }) \le f(x^{\alpha }+ \alpha d) \qquad (\forall \ d \in \{ -1,0, +1 \}^{n}), \end{aligned}$$
    (5.1)

    then \(\arg \min f \not = \emptyset \) and there exists \(x^{*} \in \arg \min f\) with

    $$\begin{aligned} \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \beta _{n} (\alpha - 1), \end{aligned}$$
    (5.2)

    where \(\beta _{n}\) is defined by

    $$\begin{aligned} \beta _{1}=1, \quad \beta _{2}=2; \qquad \beta _{n} = \frac{n+1}{2} \beta _{n-1} + 1 \quad (n=3,4,\ldots ). \end{aligned}$$
    (5.3)
  (2)

    The coefficient \(\beta _{n}\) of the proximity bound satisfies

    $$\begin{aligned} \beta _{n} \le \frac{(n+1)!}{2^{n-1}} \qquad (n=3,4,\ldots ). \end{aligned}$$
    (5.4)

The numerical values of \(\beta _{n}\), computed from the recursion (5.3), and of the upper bound (5.4) are as follows:

$$\begin{aligned} \begin{array}{c|cccccc} n &{} 1 &{} 2 &{} 3 &{} 4 &{} 5 &{} 6 \\ \hline \beta _{n} &{} 1 &{} 2 &{} 5 &{} 27/2 &{} 83/2 &{} 585/4 \\ (n+1)!/2^{n-1} &{} \text{--} &{} \text{--} &{} 6 &{} 15 &{} 45 &{} 315/2 \end{array} \end{aligned}$$
(5.5)
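The recursion (5.3) and the bound (5.4) are easy to check with exact rational arithmetic; a short sketch of our own:

```python
from fractions import Fraction
from math import factorial

def beta(n):
    """Coefficient beta_n of the proximity bound, via recursion (5.3)."""
    if n == 1:
        return Fraction(1)
    if n == 2:
        return Fraction(2)
    return Fraction(n + 1, 2) * beta(n - 1) + 1

print([beta(n) for n in range(1, 7)])
# [Fraction(1, 1), Fraction(2, 1), Fraction(5, 1), Fraction(27, 2),
#  Fraction(83, 2), Fraction(585, 4)]

# The bound beta_n <= (n+1)!/2^(n-1) of (5.4) holds for n >= 3.
assert all(beta(n) <= Fraction(factorial(n + 1), 2 ** (n - 1))
           for n in range(3, 12))
```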

Remark 5.1

The bound (5.2) can be strengthened to \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \lfloor \beta _{n} (\alpha - 1) \rfloor \). On the other hand, the stronger form \( \Vert x^{\alpha } - x^{*}\Vert _{\infty } \le \lfloor \beta _{n} \rfloor (\alpha - 1)\) is not justified by our proof and may fail.

To prove Theorem 5.1 (1) we first note that the theorem follows from its special case where \(x^{\alpha } = \varvec{0}\) and f is defined on a bounded set in the nonnegative orthant \({{\mathbb {Z}}}_{+}^{n}\). That is, the proof of Theorem 5.1 (1) is reduced to proving the following proposition. We use the notation \(N=\{1,2,\ldots ,n\}\) and \(\varvec{1}_{A}\) for the characteristic vector of \(A \subseteq N\).

Proposition 5.2

Let \(\alpha \in {{\mathbb {Z}}}_{++}\) and \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function such that \(\mathrm{dom\,}f\) is a bounded subset of  \({{\mathbb {Z}}}_{+}^{n}\) containing the origin \(\varvec{0}\). If

$$\begin{aligned} f(\varvec{0}) \le f(\alpha \varvec{1}_{A}) \qquad (\forall A \subseteq N), \end{aligned}$$
(5.6)

then there exists \(x^{*} \in \arg \min f\) with

$$\begin{aligned} \Vert x^{*}\Vert _{\infty } \le \beta _{n} (\alpha - 1), \end{aligned}$$
(5.7)

where \(\beta _{n}\) is defined by (5.3).

Suppose that Proposition 5.2 has been established. Then Theorem 5.1 (1) can be derived from Proposition 5.2 in three steps:

  1.

    We may assume \(x^{\alpha } = \varvec{0}\) by Proposition 2.3 (1).

  2.

    We may further assume that \(\mathrm{dom\,}f\) is bounded. Let M be a sufficiently large integer, say, \(M \ge \beta _{n} (\alpha - 1) + 1\), and \(f_{M}\) be the restriction of f to the integer interval \([ -M \varvec{1}, M \varvec{1} ]_{{{\mathbb {Z}}}}\), where \(\varvec{1}=(1,1,\ldots , 1)\). Then \(x^{\alpha } = \varvec{0}\) is \(\alpha \)-local minimal for \(f_{M}\). If the special case of Theorem 5.1 with \(x^{\alpha } = \varvec{0}\) and bounded \(\mathrm{dom\,}f\) is true, then there exists \(x^{*} \in \arg \min f_{M}\) satisfying \( \Vert x^{*}\Vert _{\infty } \le \beta _{n} (\alpha - 1)\). Since \(x^{*} \in \arg \min f_{M}\) we have \(f_{M}(x^{*}) \le f_{M}(x^{*}+ d)\) \((\forall \, d \in \{ -1,0, +1 \}^{n})\), which implies \(f(x^{*}) \le f(x^{*}+ d)\) \((\forall \, d \in \{ -1,0, +1 \}^{n})\). Then Theorem 2.5 shows that \(x^{*} \in \arg \min f\).

  3.

    We consider \(2^{n}\) orthants separately. For each \(s=(s_{1}, s_{2}, \ldots , s_{n}) \in \{ +1, -1 \}^{n}\) we consider the function \(f_{s}(x)=f(s x)\) on \({{\mathbb {Z}}}_{+}^{n}\), where \(s x = (s_{1} x_{1}, s_{2} x_{2}, \ldots , s_{n} x_{n})\). Noting that \(\mathrm{dom\,}f_{s}\) is a bounded subset of \({{\mathbb {Z}}}_{+}^{n}\), we apply Proposition 5.2 to \(f_{s}\) to obtain \(x^{*}_{s}\) with \( \Vert x^{*}_{s} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\). From among the \(2^{n}\) such points \(x^{*}_{s}\), take the one minimizing the function value \(f(s x^{*}_{s})\). Then \(x^{*} = s x^{*}_{s}\) is a minimizer of f, and satisfies \( \Vert x^{*} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\).
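Step 3 above, the orthant decomposition, can be sketched as follows. This is our own illustration: a brute-force search over each orthant stands in for the application of Proposition 5.2.

```python
import itertools

def min_over_orthants(f, n, radius):
    """Step 3 of the reduction: search each of the 2^n orthants
    {s*x : x in Z_+^n, ||x||_inf <= radius} and keep the best point."""
    best = None
    for s in itertools.product((1, -1), repeat=n):
        for x in itertools.product(range(radius + 1), repeat=n):
            y = tuple(si * xi for si, xi in zip(s, x))
            if best is None or f(y) < f(best):
                best = y
    return best

# Toy check with a separable convex f; the minimizer (-1, 2) lies in the
# orthant with sign pattern s = (-1, +1).
f = lambda x: (x[0] + 1) ** 2 + (x[1] - 2) ** 2
print(min_over_orthants(f, 2, 3))  # (-1, 2)
```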

5.2 Tools for the proof: f-minimality

In this section we introduce some technical tools that we use in the proof of Proposition 5.2.

For \(A\;(\ne \emptyset ) \subseteq N\), we consider a set of integer vectors

$$\begin{aligned} B_{A}&= \{ \varvec{1}_{A} + \varvec{1}_{i}, \varvec{1}_{A} - \varvec{1}_{i} \mid i \in A \} \cup \{\varvec{1}_{A}+ \varvec{1}_{i} \mid i \in N {\setminus } A \} \cup \{\varvec{1}_{A}\}, \end{aligned}$$
(5.8)

and the cones of their nonnegative integer and real combinations

$$\begin{aligned} C_{A}&= \left\{ \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^ {-} (\varvec{1}_{A} - \varvec{1}_{i}) \right. \nonumber \\&\quad \left. + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i}) + \lambda \varvec{1}_{A} \mid \mu _{i}^{+},\mu _{i}^{-}, \mu _{i}^{\circ }, \lambda \in {{\mathbb {Z}}}_{+} \right\} , \end{aligned}$$
(5.9)
$$\begin{aligned} {\tilde{C}}_{A}&= \left\{ \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^ {-} (\varvec{1}_{A} - \varvec{1}_{i}) \right. \nonumber \\&\quad \left. + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i}) + \lambda \varvec{1}_{A} \mid \mu _{i}^{+},\mu _{i}^{-}, \mu _{i}^{\circ }, \lambda \in {{\mathbb {R}}}_{+} \right\} , \end{aligned}$$
(5.10)

where \(C_{A}\) is often referred to as the integer cone generated by \(B_{A}\). We first note the following fact, which provides us with a clearer geometric view, though it is not used in the proof of Proposition 5.2.

Proposition 5.3

\(B_{A}\) is a Hilbert basis of the convex cone \({\tilde{C}}_{A}\) generated by \(B_{A}\). That is, \(C_{A} = {\tilde{C}}_{A} \cap {\mathbb {Z}}^{n}\).

Proof

The proof is given in “Appendix B”. \(\square \)

For two nonnegative integer vectors \(x, y \in {{\mathbb {Z}}}_{+}^{n}\), we write \(y \preceq _{f} x\) if \(y \le x\) and \(f(y) \le f(x)\). Note that \(y \preceq _{f} x\) if and only if \((y,f(y)) \le (x,f(x))\) in \({{\mathbb {R}}}^{n} \times ({{\mathbb {R}}}\cup \{ +\,\infty \})\). We say that \(x \in {{\mathbb {Z}}}_{+}^{n}\) is f-minimal if \(x \in \mathrm{dom\,}f\) and there exists no \(y \in {{\mathbb {Z}}}_{+}^{n}\) such that \(y \preceq _{f} x\) and \(y \not = x\). That is, x is f-minimal if and only if it is the unique minimizer of the function f restricted to the integer interval \([\varvec{0},x]_{{{\mathbb {Z}}}}\).
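Like \(\alpha \)-local minimality, f-minimality is a finite condition and can be tested by scanning the interval \([\varvec{0},x]_{{{\mathbb {Z}}}}\). A small sketch of our own, with f a Python callable returning `math.inf` outside \(\mathrm{dom\,}f\):

```python
import itertools
import math

def is_f_minimal(f, x):
    """x is f-minimal iff x lies in dom f and is the unique minimizer
    of f on the integer interval [0, x]."""
    if math.isinf(f(x)):
        return False
    for y in itertools.product(*(range(t + 1) for t in x)):
        if y != tuple(x) and f(y) <= f(x):
            return False
    return True

# For f(x) = -(x_1 + x_2), every point of [0, x] other than x itself has a
# strictly larger value, so x = (2, 2) is f-minimal.
print(is_f_minimal(lambda x: -sum(x), (2, 2)))  # True
print(is_f_minimal(lambda x: sum(x), (2, 2)))   # False: 0 beats (2, 2)
```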

The goal of this section is to establish the following connection between f-minimality and the integer cone \(C_{A}\) based at \(\alpha \varvec{1}_{A}\).

Proposition 5.4

Assume \(\alpha \)-local minimality (5.6). If \(y \in {{\mathbb {Z}}}_{+}^{n}\) is f-minimal, then \(y \not \in \alpha \varvec{1}_{A} + C_{A}\) for any \(A (\ne \emptyset ) \subseteq N\).

Our proof of this proposition is based on several lemmas.

Lemma 5.5

Assume \(\alpha \)-local minimality (5.6). For any \(A\;(\ne \emptyset ) \subseteq N\) and \(\lambda \in {{\mathbb {Z}}}_{+}\) we have \((\alpha -1) \varvec{1}_{A} \preceq _{f} (\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A}\).

Proof

First note that \((\alpha -1) \varvec{1}_{A} \le (\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A}\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\). By integral convexity of f, \(g(\lambda ) = f(\lambda \varvec{1}_{A})\) is a discrete convex function in \(\lambda \in {{\mathbb {Z}}}_{+}\), and therefore,

$$\begin{aligned} g(\alpha -1) \le \frac{\alpha -1}{\alpha } g(\alpha ) + \frac{1}{\alpha } g(0). \end{aligned}$$

On the other hand, \(g(0) \le g(\alpha )\) by the assumed \(\alpha \)-local minimality (5.6). Hence we have \(g(\alpha -1) \le g(\alpha )\). Since \(g(0) < +\,\infty \), by discrete convexity of g, this implies \(g(\alpha -1) \le g((\alpha -1)+\lambda )\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\), i.e., \(f((\alpha -1) \varvec{1}_{A}) \le f((\alpha -1)\varvec{1}_{A} +\lambda \varvec{1}_{A})\) for all \(\lambda \in {{\mathbb {Z}}}_{+}\). \(\square \)

Lemma 5.6

Let \(x \in \mathrm{dom\,}f\), \(A\;(\ne \emptyset ) \subseteq N\), and assume \(x \preceq _{f} x + \varvec{1}_{A}\). Then for any \(i \in N\), \(\delta \in \{ +1, 0, -1 \}\), and \(\lambda \in {{\mathbb {Z}}}_{+}\) we have \(x+\varvec{1}_{A}+ \delta \varvec{1}_{i} \preceq _{f} (x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}\).

Proof

First note that \(x+\varvec{1}_{A}+ \delta \varvec{1}_{i} \le (x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}\). We only need to show \(f(x+\varvec{1}_{A}+ \delta \varvec{1}_{i}) \le f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A})\) when \(f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}) < +\,\infty \). By integral convexity of f we have

$$\begin{aligned}&\frac{1}{\lambda +1} f\left( \left( x + \varvec{1}_{A} + \delta \varvec{1}_{i}\right) + \lambda \varvec{1}_{A}\right) + \frac{\lambda }{\lambda +1} f\left( x\right) \\&\quad \ge {{\tilde{f}}}\left( x + \varvec{1}_{A} + \frac{\delta }{\lambda +1} \varvec{1}_{i}\right) \\&\quad = \frac{1}{\lambda +1} f\left( x + \varvec{1}_{A} + \delta \varvec{1}_{i}\right) + \frac{\lambda }{\lambda +1} f\left( x + \varvec{1}_{A}\right) , \end{aligned}$$

whereas \(f(x+\varvec{1}_{A}) \ge f(x)\) by the assumption. Hence \(f((x + \varvec{1}_{A} + \delta \varvec{1}_{i}) + \lambda \varvec{1}_{A}) \ge f(x+\varvec{1}_{A}+ \delta \varvec{1}_{i})\). \(\square \)

Lemma 5.7

Let \(x \in \mathrm{dom\,}f\), \(A\;(\ne \emptyset ) \subseteq N\), and assume \(x \preceq _{f} x + \varvec{1}_{A}\). For any \(\lambda \in {{\mathbb {Z}}}_{+}\), \(\mu _{i}^{+},\mu _{i}^{-} \in {{\mathbb {Z}}}_{+} \;(i \in A)\), and \(\mu _{i}^{\circ } \in {{\mathbb {Z}}}_{+}\;(i \in N {\setminus } A)\), the point

$$\begin{aligned} y = x + \varvec{1}_{A} + \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^{-} (\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i}) + \lambda \varvec{1}_{A}\nonumber \\ \end{aligned}$$
(5.11)

is not f-minimal.

Proof

By the definition of an f-minimal point, we may assume \(y \in \mathrm{dom\,}f\), since otherwise we are done. Define

$$\begin{aligned} \mu = \sum _{i \in A} \left( \mu _{i}^{+}+\mu _{i}^{-}\right) + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ }, \end{aligned}$$
(5.12)

which serves, in our proof, as an index to measure the distance between x and y. If \(\mu \le 1\), then y is not f-minimal by Lemma 5.6. Suppose that \(\mu \ge 2\). In the following we construct a vector \(x'\) such that \(x' \in \mathrm{dom\,}f\), \(x' \preceq _{f} x' + \varvec{1}_{A}\), y is represented as (5.11) with \(x'\) in place of x, and the index \(\mu '\) for that representation is strictly smaller than \(\mu \).

Define \(\beta = \mu + \lambda + 1\) and

$$\begin{aligned} A^{+}&= \left\{ i \in A \mid \mu _{i}^{+} \ge 1 \right\} , \\ A^{-}&= \left\{ i \in A \mid \mu _{i}^{-} \ge 1 \right\} , \\ A^{=}&= \left\{ i \in A \mid \mu _{i}^{+} = \mu _{i}^{-} = 0 \right\} , \\ A^{\circ }&= \left\{ i \in N {\setminus } A \mid \mu _{i}^{\circ } \ge 1 \right\} , \end{aligned}$$

where we may assume, without loss of generality, that \(A^{+} \cap A^{-} = \emptyset \). Then (5.11) can be rewritten as

$$\begin{aligned} y = x + \sum _{i \in A^{+}} \mu _{i}^{+}\varvec{1}_{i} - \sum _{i \in A^{-}} \mu _{i}^{-} \varvec{1}_{i} + \sum _{i \in A^{\circ }} \mu _{i}^{\circ } \varvec{1}_{i} + \beta \varvec{1}_{A}, \end{aligned}$$

which shows

$$\begin{aligned} (y-x)_{i} = \left\{ \begin{array}{ll} \beta + \mu _{i}^{+} &{}\quad (i \in A^{+}), \\ \beta - \mu _{i}^{-} &{}\quad (i \in A^{-}), \\ \beta &{}\quad (i \in A^{=}), \\ \mu _{i}^{\circ } &{}\quad (i \in A^{\circ }), \\ 0 &{}\quad (\text{ otherwise }). \end{array} \right. \end{aligned}$$

Consider the point

$$\begin{aligned} z = \frac{\beta -1}{\beta } \ x + \frac{1}{\beta } \ y, \end{aligned}$$

which is not contained in \({{\mathbb {Z}}}^{n}\) since \( z = x + (y-x)/\beta \) and \(\displaystyle 1 \le \max \big ( \max \nolimits _{i \in A^{+}} \mu _{i}^{+}, \max \nolimits _{i \in A^{-}} \mu _{i}^{-}, \max \nolimits _{i \in A^{\circ }} \mu _{i}^{\circ } \big ) \le \beta - 1\) with \(A^{+} \cup A^{-} \cup A^{\circ } \not = \emptyset \). Since f is integrally convex and \(x, y \in \mathrm{dom\,}f\), we have \({\tilde{f}}(z) \le ((\beta -1)/\beta ) f(x) + (1/\beta ) f(y) < +\,\infty \). On the other hand, since

$$\begin{aligned} (z-x)_{i} = \left\{ \begin{array}{ll} 1 + ({\mu _{i}^{+}}/{\beta }) &{}\quad (i \in A^{+}), \\ 1 - ({\mu _{i}^{-}}/{\beta }) &{}\quad (i \in A^{-}), \\ 1 &{}\quad (i \in A^{=}), \\ {\mu _{i}^{\circ }}/{\beta } &{}\quad (i \in A^{\circ }), \\ 0 &{}\quad (\text{ otherwise }), \end{array} \right. \end{aligned}$$

the integral neighborhood N(z) of z consists of all points \(x'\) that can be represented as

$$\begin{aligned} x' = x + \varvec{1}_{A} + \varvec{1}_{A^{+} \cap D} - \varvec{1}_{A^{-} \cap D} + \varvec{1}_{A^{\circ } \cap D} \end{aligned}$$
(5.13)

for a subset D of \(A^{+} \cup A^{-} \cup A^{\circ }\). Since \({\tilde{f}}(z) < +\,\infty \) and \(z \not \in {{\mathbb {Z}}}^{n}\), we must have \(|N(z) \cap \mathrm{dom\,}f| \ge 2\), which implies that there exists a nonempty D for which \(x' \in \mathrm{dom\,}f\). Take such D that is minimal with respect to set inclusion.

We claim that \(x' \preceq _{f} x' + \varvec{1}_{A}\). Obviously we have \(x' \le x' + \varvec{1}_{A}\). To show \(f(x') \le f(x' + \varvec{1}_{A})\), we may assume \(x' + \varvec{1}_{A} \in \mathrm{dom\,}f\). Then we have the following chain of inequalities:

$$\begin{aligned}&f\left( x\right) + f\left( x' +\varvec{1}_{A}\right) \\&\quad = f\left( x\right) + f\left( x + 2\varvec{1}_{A} + \varvec{1}_{A^{+} \cap D} - \varvec{1}_{A^{-} \cap D} + \varvec{1}_{A^{\circ } \cap D}\right) \\&\quad \ge 2 {\tilde{f}}\left( x + \varvec{1}_{A} + \frac{1}{2}\varvec{1}_{A^{+} \cap D} - \frac{1}{2}\varvec{1}_{A^{-} \cap D} + \frac{1}{2}\varvec{1}_{A^{\circ } \cap D}\right)&[\text{ by } \text{ integral } \text{ convexity } \text{ of } f] \\&\quad = f\left( x + \varvec{1}_{A}\right) + f\left( x + \varvec{1}_{A} + \varvec{1}_{A^{+} \cap D} - \varvec{1}_{A^{-} \cap D} + \varvec{1}_{A^{\circ } \cap D}\right)&[\text{ by } \text{ minimality } \text{ of } D] \\&\quad = f\left( x + \varvec{1}_{A}\right) + f\left( x'\right)&[\text{ by } \left( 5.13\right) ] \\&\quad \ge f\left( x\right) + f\left( x'\right)&[\text{ by } x \preceq _{f} x + \varvec{1}_{A}] \end{aligned}$$

which shows \(f(x' + \varvec{1}_{A}) \ge f(x')\). Therefore, \(x' \preceq _{f} x' + \varvec{1}_{A}\) is true.

We finally consider the index (5.12) associated with \(x'\), which we denote by \(\mu '\). The substitution of (5.13) into (5.11) yields

$$\begin{aligned} y&= x + \varvec{1}_{A} + \sum _{i \in A^{+}} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A^{-}} \mu _{i}^{-} (\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in A^{\circ }} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i}) + \lambda \varvec{1}_{A} \nonumber \\&= x' + \varvec{1}_{A} + \left( \sum _{i \in A^{+} {\setminus } D} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A^{+} \cap D} (\mu _{i}^{+}-1)(\varvec{1}_{A} + \varvec{1}_{i}) \right) \nonumber \\&\quad + \left( \sum _{i \in A^{-} {\setminus } D} \mu _{i}^{-}(\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in A^{-} \cap D} (\mu _{i}^{-}-1)(\varvec{1}_{A} - \varvec{1}_{i}) \right) \nonumber \\&\quad + \left( \sum _{i \in A^{\circ } {\setminus } D} \mu _{i}^{\circ }(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A^{\circ } \cap D} (\mu _{i}^{\circ }-1)(\varvec{1}_{A} + \varvec{1}_{i}) \right) + (\lambda + |D| -1) \varvec{1}_{A}. \end{aligned}$$
(5.14)

This shows \(\mu ' = \mu - |D| \le \mu -1\).

The above procedure finds \(x' \in \mathrm{dom\,}f\) such that \(x' \preceq _{f} x' + \varvec{1}_{A}\) and \(\mu '\le \mu -1\), when given \(x \in \mathrm{dom\,}f\) such that \(x \preceq _{f} x + \varvec{1}_{A}\) and \(\mu \ge 2\). By repeated application of this procedure we can eventually arrive at \(x'' \in \mathrm{dom\,}f\) such that \(x'' \preceq _{f} x'' + \varvec{1}_{A}\) and \(\mu ''\le 1\). Then y is not f-minimal by Lemma 5.6 for \(x''\). \(\square \)

Lemma 5.8

If \((\alpha \varvec{1}_{A} + C_A) \cap (\mathrm{dom\,}f) \ne \emptyset \), then \(\alpha \varvec{1}_{A} \in \mathrm{dom\,}f\).

Proof

To prove by contradiction, take \(y \in (\alpha \varvec{1}_{A} + C_A) \cap (\mathrm{dom\,}f)\) that is minimal with respect to the vector ordering (componentwise ordering) and assume that \(y \ne \alpha \varvec{1}_{A}\). The vector y can be represented as

$$\begin{aligned} y = \alpha \varvec{1}_{A}+ \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^{-} (\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i})+ \lambda \varvec{1}_{A} \end{aligned}$$

with some \(\mu _{i}^{+},\mu _{i}^{-} \in {{\mathbb {Z}}}_{+} \;(i \in A)\), \(\mu _{i}^{\circ } \in {{\mathbb {Z}}}_{+}\;(i \in N {\setminus } A)\), and \(\lambda \in {{\mathbb {Z}}}_{+}\), where

$$\begin{aligned} \beta = \alpha + \sum _{i \in A} \left( \mu _{i}^{+}+ \mu _{i}^{-}\right) + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } + \lambda \end{aligned}$$

is strictly larger than \(\alpha \) since \(y \ne \alpha \varvec{1}_{A}\). Define

$$\begin{aligned} A^{+}&= \left\{ i \in A \mid \mu _{i}^{+} \ge 1 \right\} , \\ A^{-}&= \left\{ i \in A \mid \mu _{i}^{-} \ge 1 \right\} , \\ A^{=}&= \left\{ i \in A \mid \mu _{i}^{+} = \mu _{i}^{-} = 0 \right\} , \\ A^{\circ }&= \left\{ i \in N {\setminus } A \mid \mu _{i}^{\circ } \ge 1 \right\} , \end{aligned}$$

where we may assume, without loss of generality, that \(A^{+} \cap A^{-} = \emptyset \) and \(A^{-} \not = A\). We have

$$\begin{aligned} y_{i} = \left\{ \begin{array}{ll} \beta + \mu _{i}^{+} &{}\quad (i \in A^{+}), \\ \beta - \mu _{i}^{-} &{}\quad (i \in A^{-}), \\ \beta &{}\quad (i \in A^{=}), \\ \mu _{i}^{\circ } &{}\quad (i \in A^{\circ }), \\ 0 &{}\quad (\text{ otherwise }). \end{array} \right. \end{aligned}$$

Consider the point

$$\begin{aligned} z = \frac{\beta -1}{\beta } \ y + \frac{1}{\beta } \ \mathbf{0}. \end{aligned}$$

Since f is integrally convex and \(y, \mathbf{0}\in \mathrm{dom\,}f\), we have \({\tilde{f}}(z) \le ((\beta -1)/\beta ) f(y) + (1/\beta ) f(\mathbf{0}) < +\,\infty \). Note that

$$\begin{aligned} z_{i} = \left\{ \begin{array}{ll} (\beta - 1) + \mu _{i}^{+} - \mu _{i}^{+}/\beta &{}\quad (i \in A^{+}), \\ (\beta - 1) - \mu _{i}^{-} + \mu _{i}^{-}/\beta &{}\quad (i \in A^{-}), \\ (\beta - 1) &{}\quad (i \in A^{=}), \\ \mu _{i}^{\circ } - \mu _{i}^{\circ }/\beta &{}\quad (i \in A^{\circ }), \\ 0 &{}\quad (\text{ otherwise }). \end{array} \right. \end{aligned}$$

If \(A^{+} \cup A^{-} \cup A^{\circ } = \emptyset \), we immediately reach a contradiction. Indeed, we then have \(z = (\alpha + \lambda -1 ) \varvec{1}_{A}\) and \(y = (\alpha + \lambda ) \varvec{1}_{A}\) with \(\lambda \ge 1\), and hence \(z \le y\), \(z \not = y\), and \(z \in (\alpha \varvec{1}_{A} + C_{A}) \cap \mathrm{dom\,}f\) by \(f(z)={\tilde{f}}(z) < +\,\infty \), contradicting the minimality of y. In the following we assume \(A^{+} \cup A^{-} \cup A^{\circ } \not = \emptyset \), which implies \(z \not \in {{\mathbb {Z}}}^{n}\).

The integral neighborhood N(z) of z consists of all points \(y'\) that can be represented as

$$\begin{aligned} y'&= (\beta -1)\varvec{1}_{A} + \left( \sum _{i \in A^{+} {\setminus } D} \mu _{i}^{+} \varvec{1}_{i} + \sum _{i \in A^{+} \cap D} (\mu _{i}^{+}-1) \varvec{1}_{i} \right) \nonumber \\&\quad - \left( \sum _{i \in A^{-} {\setminus } D} \mu _{i}^{-} \varvec{1}_{i} + \sum _{i \in A^{-} \cap D} (\mu _{i}^{-}-1) \varvec{1}_{i} \right) + \left( \sum _{i \in A^{\circ } {\setminus } D} \mu _{i}^{\circ } \varvec{1}_{i} + \sum _{i \in A^{\circ } \cap D} (\mu _{i}^{\circ }-1) \varvec{1}_{i} \right) \end{aligned}$$

for a subset D of \(A^{+} \cup A^{-} \cup A^{\circ }\). Since \({\tilde{f}}(z) < +\,\infty \) and \(z \not \in {{\mathbb {Z}}}^{n}\), we must have \(|N(z) \cap \mathrm{dom\,}f| \ge 2\), which implies that there exists a nonempty D for which \(y' \in \mathrm{dom\,}f\). Take any \(y' \in N(z) \cap \mathrm{dom\,}f\) with \(D \not = \emptyset \). Then \(y' \le y\) and \(y' \ne y\), since \(A^{-} \not = A\) and

$$\begin{aligned} y' - y = - \varvec{1}_{A} - \sum _{i \in A^{+} \cap D} \varvec{1}_{i} + \sum _{i \in A^{-} \cap D} \varvec{1}_{i} - \sum _{i \in A^{\circ } \cap D} \varvec{1}_{i} \le \mathbf{0}. \end{aligned}$$

We also have \(y' \in (\alpha \varvec{1}_{A} + C_{A})\) by an alternative expression of \(y'\):

$$\begin{aligned} y'&= \alpha \varvec{1}_{A} + \left( \sum _{i \in A^{+} {\setminus } D} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A^{+} \cap D} (\mu _{i}^{+}-1)(\varvec{1}_{A} + \varvec{1}_{i}) \right) \nonumber \\&\quad + \left( \sum _{i \in A^{-} {\setminus } D} \mu _{i}^{-}(\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in A^{-} \cap D} (\mu _{i}^{-}-1)(\varvec{1}_{A} - \varvec{1}_{i}) \right) \nonumber \\&\quad + \left( \sum _{i \in A^{\circ } {\setminus } D} \mu _{i}^{\circ }(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A^{\circ } \cap D} (\mu _{i}^{\circ }-1)(\varvec{1}_{A} + \varvec{1}_{i}) \right) + (\lambda +|D| -1)\varvec{1}_{A}. \end{aligned}$$

Hence \(y' \in (\alpha \varvec{1}_{A} + C_{A}) \cap (\mathrm{dom\,}f)\), a contradiction to the minimality of y. \(\square \)

We are now in a position to prove Proposition 5.4. To prove the contrapositive of the claim, suppose that \(y \in \alpha \varvec{1}_{A} + C_{A}\) for some A. Then y can be expressed as

$$\begin{aligned} y = \alpha \varvec{1}_{A} + \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^{-} (\varvec{1}_{A} - \varvec{1}_{i}) + \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A}+ \varvec{1}_{i}) + \lambda \varvec{1}_{A} \end{aligned}$$

for some \(\mu _{i}^{+},\mu _{i}^{-}, \mu _{i}^{\circ }, \lambda \in {{\mathbb {Z}}}_{+}\). Equivalently,

$$\begin{aligned} y= & {} ((\alpha -1) \varvec{1}_{A} + \varvec{1}_{A}) + \sum _{i \in A} \mu _{i}^{+}(\varvec{1}_{A} + \varvec{1}_{i}) + \sum _{i \in A} \mu _{i}^{-} (\varvec{1}_{A} - \varvec{1}_{i})\\&+ \sum _{i \in N {\setminus } A} \mu _{i}^{\circ } (\varvec{1}_{A} + \,\varvec{1}_{i}) + \lambda \varvec{1}_{A}, \end{aligned}$$

which corresponds to the right-hand side of (5.11) with \(x = (\alpha -1) \varvec{1}_{A}\). By Lemma 5.8, we have \(\alpha \varvec{1}_{A} \in \mathrm{dom\,}f\). Since \(x = (\alpha -1) \varvec{1}_{A} \preceq _{f} \alpha \varvec{1}_{A} = x + \varvec{1}_{A}\) by Lemma 5.5, Lemma 5.7 shows that y is not f-minimal. This completes the proof of Proposition 5.4.

5.3 Proof of Proposition 5.2 for \(n=2\)

In this section we prove Proposition 5.2 for \(n=2\) as an illustration of the proof method using the tools introduced in Sect. 5.2. This also gives an alternative proof of Theorem 4.1.

Recall that \(\mathrm{dom\,}f\) is assumed to be a bounded subset of \({{\mathbb {Z}}}_{+}^{2}\), which implies, in particular, that \(\arg \min f \not = \emptyset \). Take \(x^{*}=(x_{1}^{*},x_{2}^{*}) \in \arg \min f\) that is f-minimal. We may assume \(x_{1}^{*} \ge x_{2}^{*}\) by Proposition 2.3 (2). Since \(x^{*}\) is f-minimal, Proposition 5.4 shows that \(x^{*}\) belongs to

$$\begin{aligned} X^{*} = \{ (x_{1},x_{2}) \in {{\mathbb {Z}}}_{+}^{2} \mid x_{1} \ge x_{2} \} {\setminus } \left( (\alpha \varvec{1}_{A} + C_{A}) \cup (\alpha \varvec{1}_{N} + C_{N}) \right) , \end{aligned}$$

where \(A = \{ 1 \}\) and \(N = \{ 1, 2 \}\). On noting

$$\begin{aligned} C_{A}&= \{ \mu _{1} (1,0) + \mu _{12} (1,1) \mid \mu _{1},\mu _{12} \in {{\mathbb {Z}}}_{+} \}, \\ C_{N}&= \{ \mu _{1} (1,0) + \mu _{2} (0,1) \mid \mu _{1},\mu _{2} \in {{\mathbb {Z}}}_{+} \}, \end{aligned}$$

we see that \(X^{*}\) consists of all integer points contained in the parallelogram with vertices (0, 0), \((\alpha -1, 0)\), \((2\alpha -2, \alpha -1)\), \((\alpha -1, \alpha -1)\). Therefore, \( \Vert x^{*}\Vert _{\infty } \le 2(\alpha -1)\). Thus Proposition 5.2 for \(n=2\) is proved.
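The shape of \(X^{*}\) and the bound \(2(\alpha -1)\) can be confirmed by brute-force enumeration. The following sketch (function names are ours) uses the membership conditions \(x \in \alpha \varvec{1}_{A} + C_{A} \Leftrightarrow x_{2} \ge 0,\ x_{1} - x_{2} \ge \alpha \) and \(x \in \alpha \varvec{1}_{N} + C_{N} \Leftrightarrow x_{1} \ge \alpha ,\ x_{2} \ge \alpha \), which follow directly from the generator representations of \(C_{A}\) and \(C_{N}\) above.

```python
def X_star(alpha):
    """Integer points of {x1 >= x2 >= 0} minus the two shifted cones (n = 2)."""
    in_cone_A = lambda x1, x2: x2 >= 0 and x1 - x2 >= alpha   # x = alpha*1_A + m1*(1,0) + m12*(1,1)
    in_cone_N = lambda x1, x2: x1 >= alpha and x2 >= alpha    # x = alpha*1_N + m1*(1,0) + m2*(0,1)
    return {(x1, x2)
            for x1 in range(3 * alpha + 1) for x2 in range(x1 + 1)
            if not in_cone_A(x1, x2) and not in_cone_N(x1, x2)}

for alpha in range(1, 7):
    pts = X_star(alpha)
    # parallelogram with vertices (0,0), (alpha-1,0), (2alpha-2,alpha-1), (alpha-1,alpha-1)
    para = {(x1, x2) for x2 in range(alpha) for x1 in range(x2, x2 + alpha)}
    assert pts == para
    assert max(max(x1, x2) for (x1, x2) in pts) == 2 * (alpha - 1)
```

The enumeration box \(0 \le x_{2} \le x_{1} \le 3\alpha \) is large enough, since every point of \(X^{*}\) satisfies \(x_{1} \le 2\alpha - 2\).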

5.4 Proof of Proposition 5.2 for \(n \ge 3\)

In this section we prove Proposition 5.2 for \(n \ge 3\) by induction on n. Accordingly we assume that Proposition 5.2 is true for every integrally convex function in \(n-1\) variables.

Let \(f: {\mathbb {Z}}^{n} \rightarrow {\mathbb {R}} \cup \{ +\,\infty \}\) be an integrally convex function such that \(\mathrm{dom\,}f\) is a bounded subset of  \({{\mathbb {Z}}}_{+}^{n}\) containing the origin \(\varvec{0}\). Note that \(\arg \min f \not = \emptyset \) and take \(x^{*}=(x_{1}^{*},x_{2}^{*},\ldots ,x_{n}^{*}) \in \arg \min f\) that is f-minimal. Then

$$\begin{aligned}{}[\mathbf{0},x^{*}]_{{{\mathbb {Z}}}} \cap \arg \min f = \{x^{*}\}. \end{aligned}$$
(5.15)

We may assume

$$\begin{aligned} x_{1}^{*} \ge x_{2}^{*} \ge \cdots \ge x_{n}^{*} \end{aligned}$$
(5.16)

by Proposition 2.3 (2).

The following lemma reveals a significant property of integrally convex functions that will be used here for induction on n. Note that, by (5.15), \(x^{*}\) satisfies the condition imposed on \(x^{\bullet }\).

Lemma 5.9

Let \(x^{\bullet } \in \mathrm{dom\,}f\) be an f-minimal point. Then for any \(i \in N\) with \(x^{\bullet }_i \ge 1\) there exists an f-minimal point \(x^{\circ } \in \mathrm{dom\,}f\) such that

$$\begin{aligned} \mathbf{0}\le x^{\circ } \le x^{\bullet }, \quad \Vert x^{\circ }-x^{\bullet } \Vert _\infty = x^{\bullet }_{i} - x^{\circ }_{i} = 1. \end{aligned}$$

Proof

Let \(x^{\circ }\) be a minimizer of f(x) among those x which belong to \(X = \{ x \in {{\mathbb {Z}}}^{n} \mid \mathbf{0}\le x \le x^{\bullet }, \ \Vert x-x^{\bullet } \Vert _\infty = 1, \ x_{i} = x^{\bullet }_{i} -1 \}\); in case of multiple minimizers, we choose a minimal minimizer with respect to the vector ordering (componentwise ordering). To prove f-minimality of \(x^{\circ }\), suppose, to the contrary, that there exists \(z \in [\mathbf{0},x^{\circ }]_{{{\mathbb {Z}}}} {\setminus } \{x^{\circ }\}\) with \(f(z) \le f(x^{\circ })\). We have \(\ell = \Vert z-x^{\bullet } \Vert _\infty \ge 2\), since otherwise \(z \in X\) and this contradicts the minimality of \(x^{\circ }\).

Consider \(y \in {{\mathbb {R}}}_{+}^{n}\) defined by

$$\begin{aligned} y = \frac{\ell -1}{\ell } x^{\bullet } + \frac{1}{\ell }z. \end{aligned}$$
(5.17)

The value of the local convex extension \({\tilde{f}}\) of f at y can be represented as

$$\begin{aligned} {\tilde{f}}(y) = \sum _{y^{j} \in Y}\lambda _{j} f(y^{j}) \end{aligned}$$

with some set \(Y \subseteq N(y) \cap \mathrm{dom\,}f\) and positive coefficients \(\lambda _{j}\) such that

$$\begin{aligned} y = \sum _{y^{j} \in Y}\lambda _{j} y^{j}, \qquad \sum _{y^{j} \in Y} \lambda _{j} = 1. \end{aligned}$$
(5.18)

Since \(\Vert y-x^{\bullet } \Vert _\infty = 1\) and \(y \le x^{\bullet }\) by (5.17), either \(y^{j}_{i} = x^{\bullet }_{i}-1\) or \(y^{j}_{i} = x^{\bullet }_{i} \) holds for each \(y^{j} \in Y\). Define

$$\begin{aligned} Y^{<} = \left\{ y^{j} \in Y \mid y^{j}_{i} = x^{\bullet }_{i}-1 \right\} , \qquad Y^{=} = \left\{ y^{j} \in Y \mid y^{j}_{i} = x^{\bullet }_{i} \right\} . \end{aligned}$$

Then we see

$$\begin{aligned} \sum _{y^{j} \in Y^{<}} \lambda _{j} = \frac{x^{\bullet }_{i}-z_{i}}{\ell }, \qquad \sum _{y^{j} \in Y^{=}} \lambda _{j} = 1- \frac{x^{\bullet }_{i}-z_{i}}{\ell } \end{aligned}$$
(5.19)

from

$$\begin{aligned} y_{i} = \frac{\ell -1}{\ell } x^{\bullet }_{i} + \frac{1}{\ell }z_{i} = x^{\bullet }_{i} - \frac{x^{\bullet }_{i}-z_{i}}{\ell }. \end{aligned}$$

On the other hand, we have

$$\begin{aligned} {\tilde{f}}(y) \le \frac{\ell -1}{\ell } f(x^{\bullet }) + \frac{1}{\ell }f(z) \end{aligned}$$
(5.20)

by integral convexity of f. We divide into cases to derive a contradiction to this inequality.

Case 1 (\(x^{\bullet }_{i}-z_{i} = \ell \)): We have \(Y = Y^{<}\) by (5.18) and (5.19) and then \(f(x^{\circ }) \le f(y^{j})\) for all \(y^{j} \in Y\) by the definition of \(x^{\circ }\). Hence

$$\begin{aligned} f(x^{\circ }) \le \sum _{y^{j} \in Y}\lambda _{j} f(y^{j}) = {\tilde{f}}(y). \end{aligned}$$
(5.21)

For the right-hand side of (5.20), note first that the f-minimality of \(x^{\bullet }\) and \(x^{\circ } \in [\mathbf{0},x^{\bullet }]_{{{\mathbb {Z}}}} {\setminus } \{x^{\bullet }\}\) imply \(f(x^{\bullet }) < f(x^{\circ })\). Then it follows from \(f(x^{\bullet }) < f(x^{\circ })\) and \(f(z) \le f(x^{\circ })\) that

$$\begin{aligned} \frac{\ell -1}{\ell } f(x^{\bullet }) + \frac{1}{\ell }f(z) < f(x^{\circ }). \end{aligned}$$
(5.22)

But (5.21) and (5.22) together contradict (5.20).

Case 2 (\(x^{\bullet }_{i}-z_{i} < \ell \)): In this case \(Y^{=}\) is nonempty. Since \(x^{\bullet } \not \in Y\) by \(\Vert x^{\bullet } - y \Vert _\infty = 1\), every \(y^{j} \in Y\) is distinct from \(x^{\bullet }\), whereas \(y^{j} \in [\mathbf{0},x^{\bullet }]_{{{\mathbb {Z}}}}\). Then the assumed f-minimality of \(x^{\bullet }\) implies

$$\begin{aligned} f(y^{j}) > f(x^{\bullet }) \qquad (\forall \, y^{j} \in Y=Y^{=} \cup Y^{<}). \end{aligned}$$
(5.23)

We also have

$$\begin{aligned} f(y^{j}) \ge f(x^{\circ }) \ge f(z) \qquad (\forall \, y^{j} \in Y^{<}), \end{aligned}$$
(5.24)

which is obvious from the definitions of \(x^{\circ }\) and z. Then we have

$$\begin{aligned} {\tilde{f}}(y) = \sum _{y^{j} \in Y^{=}}\lambda _{j} f(y^{j}) + \sum _{y^{j} \in Y^{<}}\lambda _{j} f(y^{j})&> \left( 1- \frac{x^{\bullet }_{i}-z_{i}}{\ell } \right) f(x^{\bullet }) + \frac{x^{\bullet }_{i}-z_{i}}{\ell } f(z) \\&\ge \frac{\ell -1}{\ell } f(x^{\bullet }) + \frac{1}{\ell }f(z), \end{aligned}$$

where the first (strict) inequality follows from (5.19), (5.23) on the nonempty set \(Y^{=}\), and (5.24), while the second follows from \(f(z) > f(x^{\bullet })\) (by the f-minimality of \(x^{\bullet }\)) combined with \(1 \le x^{\bullet }_{i}-z_{i} \le \ell -1\).

This is a contradiction to (5.20). \(\square \)

Lemma 5.9 can be applied repeatedly, since the resulting point \(x^{\circ }\) satisfies the condition imposed on the initial point \(x^{\bullet }\). Starting with \(x^{\bullet } = x^{*}\) we apply Lemma 5.9 repeatedly with \(i = n\). After \(x_{n}^{*}\) applications, we arrive at a point \({\hat{x}} = ({\hat{x}}_{1},{\hat{x}}_{2},\ldots ,{\hat{x}}_{n-1},0)\). This point \({\hat{x}}\) is f-minimal and

$$\begin{aligned} x_{j}^{*} - x_{n}^{*} \le {\hat{x}}_{j} \qquad (j=1,2,\ldots ,n-1). \end{aligned}$$
(5.25)

We now consider a function \({\hat{f}}: {{\mathbb {Z}}}^{n-1} \rightarrow {{\mathbb {R}}}\cup \{+\,\infty \}\) defined by

$$\begin{aligned} {\hat{f}}(x_{1},x_{2},\ldots ,x_{n-1}) = \left\{ \begin{array}{ll} f(x_{1},x_{2},\ldots ,x_{n-1},0) &{}\quad \left( 0 \le x_{j} \le {\hat{x}}_{j} \ (j=1,2,\ldots ,n-1) \right) , \\ +\,\infty &{}\quad \text{(otherwise) }. \end{array} \right. \end{aligned}$$

This function \({\hat{f}}\) is an integrally convex function in \(n-1\) variables, and the origin \(\mathbf{0}\) is \(\alpha \)-local minimal for \({\hat{f}}\). By the induction hypothesis, we can apply Proposition 5.2 to \({\hat{f}}\) to obtain

$$\begin{aligned} \Vert {\hat{x}} \Vert _{\infty } \le \beta _{n-1} (\alpha - 1). \end{aligned}$$
(5.26)

Note that \({\hat{x}}\) is the unique minimizer of \({\hat{f}}\).

Combining (5.25) and (5.26) we obtain

$$\begin{aligned} x_{1}^{*} - x_{n}^{*} \le \beta _{n-1} (\alpha - 1). \end{aligned}$$
(5.27)

We also have

$$\begin{aligned} x_{n}^{*} \le \frac{n-1}{n+1} x_{1}^{*} + \frac{2(\alpha - 1)}{n+1} \end{aligned}$$
(5.28)

as a consequence of f-minimality of \(x^{*}\); see Lemma 5.10 below. It follows from (5.27) and (5.28) that

$$\begin{aligned} x_{1}^{*} \ \le \ x_{n}^{*} + \beta _{n-1} (\alpha - 1) \ \le \ \frac{n-1}{n+1} x_{1}^{*} + \frac{2(\alpha - 1)}{n+1} + \beta _{n-1} (\alpha - 1). \end{aligned}$$

This implies

$$\begin{aligned} x_{1}^{*} \le \left( \frac{n+1}{2} \beta _{n-1} + 1 \right) (\alpha - 1) = \beta _{n} (\alpha - 1), \end{aligned}$$

where the recurrence relation

$$\begin{aligned} \beta _{n} = \frac{n+1}{2} \beta _{n-1} + 1 \end{aligned}$$

is used.

It remains to derive inequality (5.28) from f-minimality of \(x^{*}\).

Lemma 5.10

The following inequalities hold for \(x^{*}\) and \(\alpha \).

(1)
$$\begin{aligned} \sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) \ge x_{n}^{*} - \alpha + 1. \end{aligned}$$
(5.29)

(2)
$$\begin{aligned} \sum _{i=2}^{n}\left( x_{1}^{*} - x_{i}^{*}\right) \ge x_{1}^{*} - \alpha + 1. \end{aligned}$$
(5.30)

(3)
$$\begin{aligned} x_{n}^{*} \le \frac{n-1}{n+1} x_{1}^{*} + \frac{2(\alpha - 1)}{n+1}. \end{aligned}$$
(5.31)

Proof

(1) To prove by contradiction, suppose that \(\sum _{i=1}^{n-1}(x_{i}^{*} - x_{n}^{*}) \le x_{n}^{*} - \alpha \). Then the expression

$$\begin{aligned} x^{*} = \alpha \varvec{1}_{N} + \sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) \left( \varvec{1}_{N}+\varvec{1}_{i}\right) + \left( x_{n}^{*}-\alpha -\sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) \right) \varvec{1}_{N} \end{aligned}$$

shows \(x^{*} \in \alpha \varvec{1}_{N} + C_{N}\). By Proposition 5.4, this contradicts the fact that \(x^{*}\) is f-minimal.

(2) To prove by contradiction, suppose that \(\displaystyle \sum _{i=2}^{n}(x_{1}^{*} - x_{i}^{*}) \le x_{1}^{*} - \alpha \). Then the expression

$$\begin{aligned} x^{*} = \alpha \varvec{1}_{N} + \sum _{i=2}^{n}\left( x_{1}^{*} - x_{i}^{*}\right) \left( \varvec{1}_{N}-\varvec{1}_{i}\right) + \left( x_{1}^{*}-\alpha -\sum _{i=2}^{n}\left( x_{1}^{*} - x_{i}^{*}\right) \right) \varvec{1}_{N} \end{aligned}$$

shows \(x^{*} \in \alpha \varvec{1}_{N} + C_{N}\). By Proposition 5.4, this contradicts the fact that \(x^{*}\) is f-minimal.

(3) Since

$$\begin{aligned} \sum _{i=1}^{n-1}\left( x_{i}^{*} - x_{n}^{*}\right) + \sum _{i=2}^{n}(x_{1}^{*} - x_{i}^{*}) = n \left( x_{1}^{*} - x_{n}^{*}\right) , \end{aligned}$$

the addition of (5.29) and (5.30) yields

$$\begin{aligned} n \left( x_{1}^{*} - x_{n}^{*}\right) \ge x_{1}^{*} +x_{n}^{*} - 2(\alpha - 1) , \end{aligned}$$

which is equivalent to (5.31). \(\square \)

This completes the proof of Proposition 5.2, and hence that of Theorem 5.1 (1).

5.5 Estimation of \(\beta _{n}\)

The estimate of \(\beta _{n}\) given in Theorem 5.1 (2) is derived in this section.

The recurrence relation (5.3) can be rewritten as

$$\begin{aligned} \frac{2^{n}}{(n+1)!} \beta _{n} = \frac{2^{n-1}}{n!} \beta _{n-1} + \frac{2^{n}}{(n+1)!}, \end{aligned}$$

from which follows

$$\begin{aligned} \frac{2^{n}}{(n+1)!} \beta _{n}&= \frac{2^{2}}{3!} \beta _{2} + \sum _{k=3}^{n}\frac{2^{k}}{(k+1)!} = \frac{4}{3} + \sum _{k=3}^{n}\frac{2^{k}}{(k+1)!}. \end{aligned}$$
(5.32)

For the last term we have

$$\begin{aligned} \sum _{k=3}^{n}\frac{2^{k}}{(k+1)!}&\le \sum _{k=3}^{7}\frac{2^{k}}{(k+1)!} + \sum _{k=8}^{\infty }\frac{2^{k}}{(k+1)!} \le \frac{167}{315} \end{aligned}$$
(5.33)

since

$$\begin{aligned} \sum _{k=3}^{7}\frac{2^{k}}{(k+1)!}&= \frac{2^{3}}{4!} + \frac{2^{4}}{5!} + \frac{2^{5}}{6!} + \frac{2^{6}}{7!} + \frac{2^{7}}{8!} = \frac{166}{315} \ ( \approx 0.53), \\ \sum _{k=8}^{\infty }\frac{2^{k}}{(k+1)!}&\le \frac{2^{8}}{9!} \sum _{k=1}^{\infty }\left( \frac{2}{10} \right) ^{k-1} = \frac{320}{9!} \ (\approx 8.8 \times 10^{-4} ) < \frac{1}{315}. \end{aligned}$$

Substitution of (5.33) into (5.32) yields

$$\begin{aligned} \beta _{n}&\le \left( \frac{4}{3} + \frac{167}{315} \right) \frac{(n+1)!}{2^{n}} = \frac{587}{315} \times \frac{(n+1)!}{2^{n}} \le \frac{(n+1)!}{2^{n-1}}. \end{aligned}$$

Thus the upper bound (5.4) is proved.
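The arithmetic in this derivation can be verified exactly with rational numbers. The sketch below (our own naming) computes \(\beta _{n}\) from the recurrence (5.3) with \(\beta _{2} = 2\), and checks the partial sum \(166/315\), the geometric tail bound \(320/9! < 1/315\), the identity (5.32), and the final bound (5.4).

```python
from fractions import Fraction
from math import factorial

def beta(n):
    """beta_2 = 2; recurrence (5.3): beta_n = (n+1)/2 * beta_{n-1} + 1."""
    b = Fraction(2)
    for k in range(3, n + 1):
        b = Fraction(k + 1, 2) * b + 1
    return b

# the finite sum in (5.33)
s = sum(Fraction(2 ** k, factorial(k + 1)) for k in range(3, 8))
assert s == Fraction(166, 315)
# tail estimate: sum_{k>=8} 2^k/(k+1)! <= (2^8/9!) * 1/(1 - 2/10) = 320/9! < 1/315
assert Fraction(320, factorial(9)) < Fraction(1, 315)

# identity (5.32) and the bound (5.4) for a range of n
for n in range(2, 21):
    lhs = Fraction(2 ** n, factorial(n + 1)) * beta(n)
    rhs = Fraction(4, 3) + sum(Fraction(2 ** k, factorial(k + 1)) for k in range(3, n + 1))
    assert lhs == rhs
    assert beta(n) <= Fraction(factorial(n + 1), 2 ** (n - 1))
```

For instance, \(\beta _{3} = 5\) and \(\beta _{4} = 27/2\), both well within \((n+1)!/2^{n-1}\).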

6 Optimization of integrally convex functions

Despite the fact that the factor \(\beta _{n}\) of the proximity bound is superexponential in n and that integral convexity is not stable under scaling, we can design a proximity-scaling type algorithm for minimizing integrally convex functions with bounded effective domains. The algorithm runs in \(C(n) \log _{2} K_{\infty }\) time for some constant C(n) depending only on n, where \(K_{\infty }\ (>0)\) denotes the \(\ell _{\infty }\)-size of the effective domain. This means that, if the dimension n is fixed and treated as a constant, the algorithm is polynomial in the problem size. Note that no algorithm for integrally convex function minimization can be polynomial in n, since any function on the unit cube \(\{ 0,1 \}^{n}\) is integrally convex.

The proposed algorithm is a modification of the generic proximity-scaling algorithm given in the Introduction. In Step S1, we replace the function \({{\tilde{f}}}(y) = f(x + \alpha y)\) with its restriction to the discrete rectangle \(\{ y \in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} (2\alpha - 1) \}\), which is denoted by \({{\hat{f}}}(y)\). Then a local minimizer of \({{\hat{f}}}(y)\) is found to update x to \(x+ \alpha y\). Note that a local minimizer of \({{\hat{f}}}(y)\) can be found, e.g., by any descent method (the steepest descent method, in particular).

figure c
figure d
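Since the algorithm boxes (figures c and d) are rendered only as images, the following is a minimal executable sketch of the textual description above, under our own naming; it uses steepest descent for Step S1 (as the text suggests any descent method may be used) and \(\beta _{n}\) computed from the recurrence (5.3). The separable convex test function, which is integrally convex, is our own choice of example.

```python
from itertools import product
from math import ceil, log2

def beta(n):
    # beta_2 = 2; recurrence (5.3): beta_n = (n+1)/2 * beta_{n-1} + 1
    b = 2.0
    for k in range(3, n + 1):
        b = (k + 1) / 2 * b + 1
    return b

def local_min(f, x, alpha, radius, n):
    # steepest descent on f_hat(y) = f(x + alpha*y), restricted to ||alpha*y||_inf <= radius
    y = (0,) * n
    while True:
        best_val, best_z = f(tuple(x[i] + alpha * y[i] for i in range(n))), None
        for d in product((-1, 0, 1), repeat=n):
            z = tuple(y[i] + d[i] for i in range(n))
            if max(abs(alpha * zi) for zi in z) > radius:
                continue
            val = f(tuple(x[i] + alpha * z[i] for i in range(n)))
            if val < best_val:
                best_val, best_z = val, z
        if best_z is None:       # no improving neighbor: y is a local minimizer
            return y
        y = best_z

def minimize(f, n, K):
    # generic proximity-scaling scheme with the restricted Step S1 of Sect. 6
    x = (0,) * n
    alpha = 2 ** ceil(log2(K))
    while True:
        y = local_min(f, x, alpha, beta(n) * (2 * alpha - 1), n)
        x = tuple(x[i] + alpha * y[i] for i in range(n))
        if alpha == 1:
            return x
        alpha //= 2

# separable convex (hence integrally convex) test function on the box [0, 16]^2
f = lambda x: sum((xi - ci) ** 2 for xi, ci in zip(x, (5, 9))) \
    if all(0 <= xi <= 16 for xi in x) else float('inf')
print(minimize(f, 2, 16))   # (5, 9)
```

Halving \(\alpha \) after each phase reproduces the \(O(\log _{2} K_{\infty })\) phase count of the generic algorithm from the Introduction.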

The correctness of the algorithm can be shown as follows. We first assume that f has a unique (global) minimizer \(x^{*}\). Let \(x^{2\alpha }\) denote the vector x at the beginning of Step S1, and define

$$\begin{aligned} f^{(\alpha )}(x)&= \left\{ \begin{array}{ll} f(x) &{}\quad (\Vert x - x^{2\alpha } \Vert _{\infty } \le \beta _{n} (2\alpha - 1) ), \\ +\,\infty &{}\quad (\text{ otherwise }), \\ \end{array}\right. \\ {{\hat{f}}}^{(\alpha )}(y)&= \left\{ \begin{array}{ll} f(x^{2\alpha } + \alpha y) &{}\quad (\Vert \alpha y \Vert _{\infty } \le \beta _{n} (2\alpha - 1) ), \\ +\,\infty &{}\quad (\text{ otherwise }). \\ \end{array}\right. \end{aligned}$$

Note that \(f^{(\alpha )}\) is integrally convex, whereas \({{\hat{f}}}^{(\alpha )}\) is not necessarily so. Let \({{\hat{y}}}^{\alpha }\) be the output of Step S1 and \(x^{\alpha } = x^{2\alpha } + \alpha {{\hat{y}}}^{\alpha }\). Then \({{\hat{y}}}^{\alpha }\) is a local minimizer of \({{\hat{f}}}^{(\alpha )}\) and \(x^{\alpha } - x^{2\alpha } = \alpha {{\hat{y}}}^{\alpha } \in (\alpha {\mathbb {Z}})^{n}\).

Lemma 6.1

\(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) for all \(\alpha \).

Proof

This is obviously true in the initial phase with \(\alpha = 2^{\lceil \log _{2} K_{\infty } \rceil }\). To prove \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) by induction on descending \(\alpha \), we show that \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) implies \(x^{*} \in \mathrm{dom\,}f^{(\alpha /2)}\). Since \(x^{*} \in \mathrm{dom\,}f^{(\alpha )}\) and \(x^{*} \in \arg \min f\), we have \(x^{*} \in \arg \min f^{(\alpha )}\). On the other hand, \(x^{\alpha }\) is an \(\alpha \)-local minimizer of \(f^{(\alpha )}\), since \({{\hat{y}}}^{\alpha }\) is a local minimizer of \({{\hat{f}}}^{(\alpha )}\). Then, by the proximity theorem (Theorem 5.1) for \(f^{(\alpha )}\), we obtain \(\Vert x^{\alpha } - x^{*} \Vert _{\infty } \le \beta _{n} (\alpha - 1)\), which shows \(x^{*} \in \mathrm{dom\,}f^{(\alpha /2)}\). \(\square \)

In the final phase with \(\alpha = 1\), \(f^{(\alpha )}\) is an integrally convex function, and hence, by Theorem 2.5, an \(\alpha \)-local minimizer of \(f^{(\alpha )}\) is a global minimizer of \(f^{(\alpha )}\). This observation, with Lemma 6.1 above, shows that the output of the algorithm is a global minimizer of f.

The complexity of the algorithm can be analyzed as follows. The number of iterations in the descent method is bounded by the total number of points in \(Y = \{ y\in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} (2\alpha -1) \}\), which is bounded by \((4\beta _{n})^{n}\). For each y we examine all of its \(3^{n}\) neighboring points to find a descent direction or verify its local minimality. Thus Step S1, which updates \(x^{2\alpha }\) to \(x^{\alpha }\), can be done with at most \((12\beta _{n})^{n}\) function evaluations. The number of scaling phases is \(\log _{2} K_{\infty }\). Therefore, the time complexity (or the number of function evaluations) is bounded by \((12\beta _{n})^{n}\log _{2} K_{\infty }\). For a fixed n, this gives a polynomial bound \(O(\log _{2} K_{\infty })\) in the problem size.

Finally, we describe how to get rid of the uniqueness assumption on the minimizer. Consider a perturbed function \(f_{\varepsilon }(x) = f(x) + \sum _{i=1}^{n} \varepsilon ^{i} x_{i}\) with a sufficiently small \(\varepsilon >0\). By the assumed boundedness of the effective domain of f, the perturbed function has a minimizer, which is unique as a result of the perturbation. To find the minimum of \(f_{\varepsilon }\), it is not necessary to introduce the parameter \(\varepsilon \) explicitly into the algorithm; instead, a lexicographically smallest local minimizer y of \({{\hat{f}}}(y)\) should be found in Step S1.
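As a toy illustration of this perturbation (not of Step S1 itself), the following sketch shows how \(f_{\varepsilon }\) singles out the lexicographically smallest of several tied minimizers; the helper name and the test function are ours, and exact rationals stand in for "sufficiently small \(\varepsilon \)".

```python
from fractions import Fraction

def perturb(f, n, eps):
    # f_eps(x) = f(x) + sum_i eps^i * x_i, evaluated exactly with Fraction
    return lambda x: f(x) + sum(eps ** (i + 1) * x[i] for i in range(n))

# |x1 + x2 - 2| is minimized on the whole segment (0,2), (1,1), (2,0)
f = lambda x: abs(x[0] + x[1] - 2)
f_eps = perturb(f, 2, Fraction(1, 100))

box = [(x1, x2) for x1 in range(3) for x2 in range(3)]
assert {p for p in box if f(p) == 0} == {(0, 2), (1, 1), (2, 0)}  # many minimizers of f
assert min(box, key=f_eps) == (0, 2)   # unique, the lexicographically smallest one
```

The weights \(\varepsilon , \varepsilon ^{2}, \ldots , \varepsilon ^{n}\) decrease geometrically, so ties in \(x_{1}\) are broken by \(x_{2}\), and so on, which is exactly the lexicographic rule.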

Remark 6.1

Some technical points are explained here. By working with \(f^{(\alpha )}\), we can bound the number of iterations for finding an \(\alpha \)-local minimizer in terms of the number of integer vectors contained in \(\mathrm{dom\,}{{\hat{f}}}^{(\alpha )}\). The vector \(x^{\alpha }\) is an \(\alpha \)-local minimizer for \(f^{(\alpha )}\), but not necessarily for the original function f. This is why we apply the proximity theorem to \(f^{(\alpha )}\) in the proof of Lemma 6.1.

Remark 6.2

The proximity bound \(\beta _{n} (\alpha - 1)\) in Theorem 5.1 is linear in \(\alpha \). This linear dependence on \(\alpha \) is critical for the complexity \(O(\log _{2} K_{\infty })\) of the algorithm when n is fixed. Suppose, for example, that the proximity bound is \(\beta _{n} (\alpha ^{m} - 1)\) for some \(m >1\). Then in the above analysis, \((2\alpha -1)\) should be replaced by \(((2\alpha )^{m} - 1)\), and the total number of points in \(Y = \{ y\in {\mathbb {Z}}^{n} \mid \Vert \alpha y \Vert _{\infty } \le \beta _{n} ((2\alpha )^{m} -1) \}\) is bounded by \((2^{m+1}\beta _{n})^{n} \alpha ^{(m-1)n}\). The sum of \(\alpha ^{(m-1)n}\) over \(\alpha = 1, 2, 2^{2},\ldots , 2^{\lceil \log _{2} K_{\infty } \rceil }\) is of the order of \(K_{\infty }^{(m-1)n}\). Then the proposed algorithm will not be polynomial in \(\log _{2} K_{\infty }\). Thus the particular form \(\beta _{n} (\alpha - 1)\) of our proximity bound is important for our algorithm.

7 Concluding remarks

As shown in this paper, the nice properties of \(\mathrm{L}^{\natural }\)-convex functions such as stability under scaling and the proximity bound \(n(\alpha -1)\) are not shared by integrally convex functions in general. Two subclasses of integrally convex functions which still enjoy these nice properties have been introduced in [19], based on discrete midpoint convexity (1.1) for every pair \((x, y) \in {{\mathbb {Z}}}^{n} \times {{\mathbb {Z}}}^{n}\) with \(\Vert x - y \Vert _{\infty } \ge 2\) or \(\Vert x - y \Vert _{\infty } = 2\), respectively. Both classes are superclasses of \(\mathrm{L}^{\natural }\)-convex functions and subclasses of integrally convex functions; for every n they are closed under scaling and admit a proximity theorem with the bound \(n (\alpha -1)\). See [19] for details.