Generalized Semimodularity: Order Statistics

Pinelis, Iosif

doi:10.1007/978-3-030-26391-1_8


Part of the book series: Progress in Probability (PRPR, volume 74)


Abstract

A notion of generalized n-semimodularity is introduced, which extends that of (sub/super)modularity in four ways at once. The main result of this paper, stating that every generalized (n:2)-semimodular function on the nth Cartesian power of a distributive lattice is generalized n-semimodular, may be considered a multi/infinite-dimensional analogue of the well-known Muirhead lemma in the theory of Schur majorization. This result is also similar to a discretized version of the well-known theorem due to Lorentz, which was given only for additive-type functions. Illustrations of our main result are presented for counts of combinations of faces of a polytope; one-sided potentials; multiadditive forms, including multilinear ones (in particular, permanents of rectangular matrices and elementary symmetric functions); and association inequalities for order statistics. Based on an extension of the FKG inequality due to Rinott & Saks and Aharoni & Keich, applications to correlation inequalities for order statistics are given as well.



8.1 Summary and Discussion

As pointed out e.g. in [3, 4], the notion of submodularity has become useful in various areas: combinatorial optimization, with many applications in operations research; machine learning; computer vision; electrical networks; signal processing; several areas of theoretical computer science, such as matroid theory; economics. One may also note the use of this notion in potential theory [6], as a capacity is a submodular function.

Let L be any distributive lattice; for definitions and facts pertaining to lattices, see e.g. [10].

A function $\lambda \colon L\to \mathbb {R}$ is called submodular if

$$\displaystyle \begin{aligned} \lambda(f)+\lambda(g)\ge\lambda(f\vee g)+\lambda(f\wedge g) \end{aligned} $$

(8.1)

for all f and g in L. A function λ is called supermodular if the function −λ is submodular, and λ is called modular if it is both submodular and supermodular. See e.g. [4, 9, 18, 19, 24, 25]. Let us say that a function μ is log-submodular if $\ln \mu $ is submodular, and log-supermodular if $\ln \mu $ is supermodular. The log-supermodularity condition and the corresponding log-submodularity condition were referred to in Karlin and Rinott [13, 14] as the multivariate total positivity of order 2 (MTP₂) and the multivariate reverse rule of order 2 (MRR₂), respectively. As noted by Choquet [6, §14.3], a nondecreasing function λ is alternating of order 2 iff it satisfies inequality (8.1), that is, λ is submodular; it was also shown in [6] that the classical Newtonian capacity is such a function.

The log-supermodularity condition is the condition under which the famous Fortuin–Kasteleyn–Ginibre (FKG) correlation inequality [8] holds. Therefore, using inequality (8.17) together with the FKG inequality and its generalizations, we will be able to obtain the corresponding applications, in Corollaries 8.2.11 and 8.2.12.

More generally, let $\mathcal {R}$ be any set, endowed with a transitive relation ⋈, so that for any a, b, c in $\mathcal {R}$ one has the implication a ⋈ b & b ⋈ c ⇒ a ⋈ c. For any natural n, let us say that a function $\Lambda \colon L^n\to \mathcal {R}$ is generalized n-semimodular if

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n)\mathbin{\Join}\Lambda(f_{n:1},\dots,f_{n:n}) \end{aligned}$$

for all $f=(f_1,\dots,f_n)\in L^n$, where $f_{n:1},\dots,f_{n:n}$ are the “order statistics” for f defined by the formula

$$\displaystyle \begin{aligned} f_{n:j}=\bigwedge\Big\{\bigvee_{i\in J}f_i\colon J\in\binom{[n]}j\Big\} \end{aligned} $$

(8.2)

for $j\in [n]:=\overline {1,n}$, with $\binom {[n]}j$ denoting the set of all subsets J of the set [n] such that the cardinality of J is j. Here and in the sequel we use the notation $\overline {\alpha ,\beta }:=\{j\in \mathbb {Z}\colon \alpha \le j\le \beta \}$. In particular, $f_{n:1}=f_1\wedge\dots\wedge f_n$ and $f_{n:n}=f_1\vee\dots\vee f_n$.
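As an illustration of formula (8.2), here is a minimal Python sketch that evaluates the “order statistics” literally as a meet of joins. It assumes the lattice of subsets of a finite set (join = union, meet = intersection); the helper name `order_statistics` and the sample sets are hypothetical and serve only as an example.

```python
from itertools import combinations
from functools import reduce

def order_statistics(fs, join, meet):
    """Return (f_{n:1}, ..., f_{n:n}) computed literally from formula (8.2)."""
    n = len(fs)
    result = []
    for j in range(1, n + 1):
        # joins of f_i over every index set J of cardinality j ...
        joins = [reduce(join, [fs[i] for i in J]) for J in combinations(range(n), j)]
        # ... followed by the meet of all those joins
        result.append(reduce(meet, joins))
    return tuple(result)

# Lattice of subsets of {0, ..., 4}: join = union, meet = intersection.
sets = (frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({2, 4}))
stats = order_statistics(sets, frozenset.union, frozenset.intersection)
print(stats)
```

In this lattice, $f_{n:j}$ consists of the points that belong to at least n + 1 − j of the sets, so that the first and last “order statistics” are the intersection and the union, respectively.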

For any function $\lambda \colon L\to \mathbb {R}$, let the function $\Lambda _\lambda \colon L^2\to \mathbb {R}$ be given by the formula Λ_λ(f, g) := λ(f) + λ(g) for f and g in L. Then, obviously, λ is submodular or supermodular or modular if and only if Λ_λ is generalized 2-semimodular with the relation “⋈” being “≥” or “≤” or “=”, respectively.

Thus, the notion of generalized n-semimodularity extends that of (sub/super)modularity in four ways at once: (1) the function Λ may be a function of any natural number n of arguments, whereas λ is a function of only one argument; (2) in contrast with a general form of dependence of $\Lambda(f_1,\dots,f_n)$ on $f_1,\dots,f_n$, the function Λ_λ of two arguments is of the special form, linear in λ(f) and λ(g); (3) whereas the values of λ are real numbers, those of Λ may be in any set $\mathcal {R}$; and (4) we now have an arbitrary transitive relation ⋈ over $\mathcal {R}$ instead of one of the three particular relations “≥” or “≤” or “=” over $\mathbb {R}$.

For any k ∈ [n], let us say that a function $\Lambda \colon L^n\to \mathcal {R}$ is generalized (n:k)-semimodular if for each $j\in \overline {0,{n-k}}$ and each (n − k)-tuple $(f_i\colon i\in [n]\,\setminus \,\overline {{j+1},{j+k}})\in L^{n-k}$ the function $L^k\ni (f_{j+1},\dots,f_{j+k})\mapsto \Lambda(f_1,\dots,f_n)$ is generalized k-semimodular. In particular, Λ is generalized (n:n)-semimodular if and only if it is generalized n-semimodular.

Whenever the relation “⋈” is denoted as “≥” or “≤” or “=”, let us replace “semi” in the above definitions by “sub”, “super”, and “”, respectively. For instance, “generalized n-modular” will stand for “generalized n-semimodular” with the relation “⋈” being “=”.

The main result of this note is

Theorem 8.1.1

Again, let L be any distributive lattice. If a function $\Lambda \colon L^n\to \mathcal {R}$ is generalized (n:2)-semimodular, then it is generalized n-semimodular.

The necessary proofs will be given in Sect. 8.3.

As will be seen from the proof of Theorem 8.1.1, the condition that the function Λ be generalized (n:2)-semimodular can be relaxed to the following: for each $j\in \overline {1,{n-1}}$ and each $f=(f_1,\dots,f_n)\in L^n$ such that $f_1\le\dots\le f_j$, one has $\Lambda(f_1,\dots,f_n)\Join\Lambda(f_1,\dots,f_{j-1},f_j\wedge f_{j+1},f_j\vee f_{j+1},f_{j+2},\dots,f_n)$.

Remark 8.1.2

Theorem 8.1.1 will not hold in general if the lattice L is not assumed to be distributive. For instance, let L be defined by the set [5] = {1, 2, 3, 4, 5} with the partial order being the subset of the natural order ≤ on the set [5] with elements 2, 3, 4 now considered non-comparable with one another, so that the resulting order relation is the set {(f, f): f ∈ [5]} ∪ {(1, 2), (1, 3), (1, 4), (2, 5), (3, 5), (4, 5), (1, 5)}; then, in particular, 2 ∧ 3 = 1 and 2 ∨ 3 = 5. This lattice is one of the simplest examples of non-distributive lattices. It is isomorphic to the diamond lattice $M_3$; see e.g. [10, p. 110]. Let n = 3, $\mathcal {R}=\mathbb {R}$, and define the function $\Lambda \colon L^3\to \mathbb {R}$ by the formula $\Lambda(f_1,f_2,f_3):=12f_1f_2+3f_2f_3+5f_1f_3$ for all $f=(f_1,f_2,f_3)\in L^3$. Then one can verify directly, by a straightforward but tedious calculation consisting in checking 2 × 5³ = 250 inequalities (two inequalities for each $f=(f_1,f_2,f_3)\in [5]^3$), that this function Λ is generalized (3:2)-submodular. However, Λ is not generalized 3-submodular, because for f = (2, 3, 4) one has $(f_{3:1},f_{3:2},f_{3:3})=(1,5,5)$ and $\Lambda(f_1,f_2,f_3)=\Lambda(2,3,4)=148\not\ge 160=\Lambda(1,5,5)=\Lambda(f_{3:1},f_{3:2},f_{3:3})$. □
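The “straightforward but tedious calculation” of this remark can be delegated to a short script. The following Python sketch encodes the five-element lattice of the remark, checks the 2 × 5³ pairwise inequalities, and evaluates the failing triple (2, 3, 4). The function names (`leq`, `meet`, `join`, `Lam`, `order_stats`) are hypothetical helpers introduced here for illustration only.

```python
from itertools import combinations, product
from functools import reduce

ELEMS = (1, 2, 3, 4, 5)

def leq(a, b):
    """Order of Remark 8.1.2: 1 is the bottom, 5 the top; 2, 3, 4 are incomparable."""
    return a == b or a == 1 or b == 5

def meet(a, b):
    lower = [c for c in ELEMS if leq(c, a) and leq(c, b)]
    return next(c for c in lower if all(leq(d, c) for d in lower))

def join(a, b):
    upper = [c for c in ELEMS if leq(a, c) and leq(b, c)]
    return next(c for c in upper if all(leq(c, d) for d in upper))

def Lam(f1, f2, f3):
    return 12 * f1 * f2 + 3 * f2 * f3 + 5 * f1 * f3

def order_stats(fs):
    """Formula (8.2): meet over all j-subsets J of the joins of f_i, i in J."""
    n = len(fs)
    return tuple(
        reduce(meet, [reduce(join, [fs[i] for i in J]) for J in combinations(range(n), j)])
        for j in range(1, n + 1)
    )

# Two inequalities for each triple: replace an adjacent pair by (meet, join).
pairwise_ok = all(
    Lam(f1, f2, f3) >= Lam(meet(f1, f2), join(f1, f2), f3)
    and Lam(f1, f2, f3) >= Lam(f1, meet(f2, f3), join(f2, f3))
    for f1, f2, f3 in product(ELEMS, repeat=3)
)

f = (2, 3, 4)
print(pairwise_ok, order_stats(f), Lam(*f), Lam(*order_stats(f)))
```

The pairwise inequalities all hold, whereas the comparison of Λ(2, 3, 4) with Λ(1, 5, 5) goes the wrong way, in agreement with the remark.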

Remark 8.1.3

A well-known fact, which will be crucial in the proof of Theorem 8.1.1, is the representation theorem due to Birkhoff and Stone stating that any distributive lattice L is isomorphic to a lattice of subsets of (and hence to a lattice of nonnegative real-valued functions on) a certain set S, depending on L (see e.g. [10, Theorem 119]). For such a lattice of functions, the “order statistics” f _n:1, …, f _n:n are uniquely determined by the condition that

$$\displaystyle \begin{aligned} f_{n:1}(s)\le\dots\le f_{n:n}(s)\quad \text{and}\quad \{\{f_{n:1}(s),\dots,f_{n:n}(s)\}\}=\{\{f_1(s),\dots,f_n(s)\}\} \end{aligned} $$

(8.3)

for each s ∈ S, where the double braces are used to denote multisets, with appropriate multiplicities. To quickly see why this is true, one may reason as follows: Let us now use condition (8.3) to define f _n:1, …, f _n:n. Note that the value of the right-hand side (rhs) of (8.2) at any point s ∈ S is invariant with respect to all permutations of the values f ₁(s), …, f _n(s). So, the value of the rhs of (8.2) at s will not change if one replaces there f ₁, …, f _n by f _n:1, …, f _n:n, and this value will equal f _n:j(s). Thus, the definition of f _n:1, …, f _n:n by means of formula (8.3) is equivalent to the one given by (8.2), if the lattice L is already a lattice of real-valued functions on S. Moreover, it is clear now that, if the lattice L is distributive, then definition (8.2) can be rewritten in the dual form, as

$$\displaystyle \begin{aligned} f_{n:j}=\bigvee\Big\{\bigwedge_{i\in J}f_i\colon J\in\binom{[n]}{n+1-j}\Big\} \end{aligned} $$

(8.4)

for all j ∈ [n].

On the other hand, it can be seen that, if L is not distributive, then this duality can be lost and each of the definitions (8.2) and (8.4) of $f_{n:j}$ can be rather unnaturally skewed up or down. For instance, in the counterexample given in Remark 8.1.2, for f = (2, 3, 4) we had $(f_{3:1},f_{3:2},f_{3:3})=(1,5,5)$ according to definition (8.2), but we would have $(f_{3:1},f_{3:2},f_{3:3})=(1,1,5)$ according to (8.4).

However, one may note that the right-hand side of (8.4) never exceeds that of (8.2); this follows because for any $J\in \binom {[n]}{n+1-j}$ and any $K\in \binom {[n]}j$ the cardinalities of J and K add up to n + 1 > n, so that there is some k ∈ J ∩ K, and then $\bigwedge_{i\in J}f_i\le f_k\le \bigvee_{i\in K}f_i$. □
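The equivalence of (8.2), (8.3), and (8.4) for a lattice of real-valued functions can be checked numerically. The sketch below assumes a finite set S with four points, represents each function as a tuple of its values, and uses the pointwise maximum and minimum as the lattice operations; all names and the sample data are illustrative assumptions.

```python
from itertools import combinations
from functools import reduce

def vmax(f, g):  # pointwise join
    return tuple(max(a, b) for a, b in zip(f, g))

def vmin(f, g):  # pointwise meet
    return tuple(min(a, b) for a, b in zip(f, g))

def stats_8_2(fs):
    """Meet of joins over index sets of cardinality j, as in (8.2)."""
    n = len(fs)
    return [reduce(vmin, [reduce(vmax, [fs[i] for i in J])
                          for J in combinations(range(n), j)])
            for j in range(1, n + 1)]

def stats_8_4(fs):
    """Join of meets over index sets of cardinality n + 1 - j, as in (8.4)."""
    n = len(fs)
    return [reduce(vmax, [reduce(vmin, [fs[i] for i in J])
                          for J in combinations(range(n), n + 1 - j)])
            for j in range(1, n + 1)]

def stats_pointwise(fs):
    """Condition (8.3): sort the values f_1(s), ..., f_n(s) at each point s."""
    columns = [sorted(values) for values in zip(*fs)]
    return [tuple(col[j] for col in columns) for j in range(len(fs))]

fs = [(3, 0, 2, 5), (1, 4, 2, 0), (2, 2, 7, 1)]
print(stats_8_2(fs) == stats_8_4(fs) == stats_pointwise(fs))
```

Pointwise sorting is by far the cheapest of the three computations, which is why it is convenient for numerical experiments with the results of this paper.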

In view of the lattice representation theorem cited in Remark 8.1.3, Theorem 8.1.1 may be considered a multi/infinite-dimensional analogue of the well-known Muirhead lemma in the theory of Schur majorization (cf. e.g. [17, Lemma 2.B.1, p. 32]), which may be stated as follows: for vectors x and y in $\mathbb {R}^n$ such that x ≺ y (that is, x is majorized by y), there exist finitely many vectors x ₀, …, x _m in $\mathbb {R}^n$ such that x = x ₀ ≺⋯ ≺ x _m = y and for each $j\in \overline {0,m-1}$ the vectors x _j and x _j+1 differ only in two coordinates. However, no direct multi-dimensional extension of the Muirhead lemma seems to exist, even in two dimensions (see e.g. [20, p. 11]).

For functions that are “infinite-dimensional” counterparts of the “m-dimensional” function $\Lambda \colon L^m\to \mathbb {R}$ given by the formula of the additive form

$$\displaystyle \begin{aligned} \Lambda(g_1,\dots,g_m)=\sum_{j=1}^m\lambda_j(g_j), \end{aligned} $$

(8.5)

Lorentz [16] obtained a result similar to Theorem 8.1.1; for readers’ convenience, let us reproduce it here: For each j ∈ [n], let $f_j^*$ denote the equimeasurable decreasing rearrangement [11] of a function $f_j\colon (0,1)\to \mathbb {R}$. Let a real-valued expression Φ(x, u ₁, …, u _n) be continuous in (x, u ₁, …, u _n) ∈ (0, 1) × [0, ∞) ×⋯ × [0, ∞). Then the inequality

$$\displaystyle \begin{aligned} \int_0^1\Phi(x,f_1(x),\dots,f_n(x))\,dx\le\int_0^1\Phi(x,f_1^*(x),\dots,f_n^*(x))\,dx \end{aligned} $$

(8.6)

holds for all bounded positive measurable functions f ₁, …, f _n from (0, 1) to $\mathbb {R}$ if and only if the following two conditions hold:

$$\displaystyle \begin{aligned} \Phi(u_i+h,u_j+h)-\Phi(u_i+h,u_j)-\Phi(u_i,u_j+h)+\Phi(u_i,u_j)\ge0 \end{aligned} $$

(8.7)

and

$$\displaystyle \begin{aligned} \int_0^\delta\big[\Phi(x-t,u_i+h)-\Phi(x-t,u_i) -\Phi(x+t,u_i+h)+\Phi(x+t,u_i)\big]\,dt\ge0 \end{aligned} $$

(8.8)

for all h > 0, x ∈ (0, 1), δ ∈ (0, x ∧ (1 − x)), (u ₁, …, u _n) ∈ [0, ∞)ⁿ, and i, j in [n] such that i < j; here, in each of inequalities (8.7) and (8.8), the arguments of Φ that are the same for all the four instances of Φ are omitted, for brevity.

To establish the connection between Lorentz’s result and our Theorem 8.1.1, suppose e.g. that each of the functions f ₁, …, f _n in [16] is a step function, constant on each of the intervals $(\frac {j-1}m,\frac jm]$ for j ∈ [m], and then let $g_j(s):=f_s(\frac jm)$ for j ∈ [m] and s ∈ S := [n]. In fact, in the proof in [16] the result is first established for such step functions f ₁, …, f _n. It is also shown in [16] that, for such “infinite-dimensional” counterparts of the functions given by the “additive” formula (8.5), the sufficient condition is also necessary. In turn, as pointed out in [16], the result there generalizes an inequality in [23]. Another proof of a special case of the result in [16] was given in [5].

8.2 Illustrations and Applications

8.2.1 A General Construction of Generalized n-Submodular Functions from Submodular Ones

Recall here some basics of majorization theory [17]. For $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_n)$ in $\mathbb {R}^n$, write x ≺ y if $x_1+\dots+x_n=y_1+\dots+y_n$ and $x_{n:1}+\dots+x_{n:k}\ge y_{n:1}+\dots+y_{n:k}$ for all k ∈ [n], where $x_{n:1}\le\dots\le x_{n:n}$ are the coordinates of x arranged in nondecreasing order, and similarly for y. For any $D\subseteq \mathbb {R}^n$, a function $F\colon D\to \mathbb {R}$ is called Schur-concave if for any x and y in D such that x ≺ y one has F(x) ≥ F(y). If $D=I^n$ for some open interval $I\subseteq \mathbb {R}$ and the function F is continuously differentiable then, by Schur’s theorem [17, Theorem A.4], F is Schur-concave iff F is symmetric and $(\frac {\partial F}{\partial x_i}-\frac {\partial F}{\partial x_j})(x_i-x_j)\le 0$ for all $x=(x_1,\dots,x_n)\in D$ and all i, j in [n].
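For concreteness, the majorization relation can be tested mechanically. The sketch below assumes the standard definition of x ≺ y (equal totals, and each sum of the k smallest coordinates of x at least the corresponding sum for y); the function names are hypothetical. The product of coordinates, which is Schur-concave on the nonnegative orthant, serves as a sample F.

```python
def is_majorized_by(x, y, tol=1e-12):
    """Return True if x ≺ y: equal totals, and every sum of the k smallest
    entries of x is at least the corresponding sum for y."""
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    partial_x = partial_y = 0.0
    for a, b in zip(sorted(x), sorted(y)):
        partial_x += a
        partial_y += b
        if partial_x < partial_y - tol:
            return False
    return True

def F(x):
    """Product of the coordinates, a Schur-concave function for nonnegative x."""
    p = 1.0
    for v in x:
        p *= v
    return p

x, y = (2, 2, 2), (1, 2, 3)
print(is_majorized_by(x, y), F(x) >= F(y))
```

Here x = (2, 2, 2) is majorized by y = (1, 2, 3), and accordingly F(x) = 8 is no smaller than F(y) = 6.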

Proposition 8.2.1

Suppose that a real-valued function λ defined on a distributive lattice L is submodular and nondecreasing, and a function $\mathbb {R}^n\ni x=(x_1,\dots ,x_n)\to F(x_1,\dots ,x_n)$ is nondecreasing in each of its n arguments and Schur-concave. Then the function $\Lambda =\Lambda _{\lambda ,F}\colon L^n\to \mathbb {R}$ defined by the formula

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n):=\Lambda_{\lambda,F}(f_1,\dots,f_n):=F(\lambda(f_1),\dots,\lambda(f_n)) \end{aligned} $$

(8.9)

for $(f_1,\dots,f_n)\in L^n$ is generalized (n:2)-submodular and hence generalized n-submodular.

A rather general construction of submodular functions on rings of sets is provided by [6, §23.2], which implies that ∪-homomorphisms preserve the property of being alternating of a given order, and the proposition at the end of [6, §23.1], which describes general ∪-homomorphisms as maps of the form

$$\displaystyle \begin{aligned} S\supseteq A\mapsto G(A):=\{t\in T\colon (s,t)\in G\text{ for some }s\in A\}, \end{aligned}$$

where S and T are sets and G ⊆ S × T; in the case when G is (the graph of) a map, the above notation G(A) is of course consistent with that for the image of a set A under the map G; according to the definition in the beginning of [6, §23], a ∪-homomorphism is a map φ of set rings defined by the condition φ(A ∪ B) = φ(A) ∪ φ(B) for all relevant sets A and B.

Therefore and because an additive function on a ring of sets is modular and hence submodular, we conclude that functions of the form

$$\displaystyle \begin{aligned} A\mapsto \mu(G(A)) \end{aligned} $$

(8.10)

are submodular, where μ is a measure or, more generally, an additive function (say on a discrete set, to avoid matters of measurability). From this observation, one can immediately obtain any number of corollaries of Proposition 8.2.1 such as the following:

Corollary 8.2.2

Let P be a polytope of dimension d. For each $\alpha \in \overline {0,d}$, let $\mathcal {F}_\alpha $ denote the set of all α-faces (that is, faces of dimension α) of P. For any distinct α, β, γ in $\overline {0,d}$, let G = G _α,β,γ be the set of all pairs $\big (f_\alpha ,(f_\beta ,f_\gamma )\big )\in \mathcal {F}_\alpha \times (\mathcal {F}_\beta \times \mathcal {F}_\gamma )$ such that f _α ∩ f _β≠∅, f _α ∩ f _γ≠∅, and f _β ∩ f _γ≠∅. Let L be a lattice of subsets of $\mathcal {F}_\alpha $. Let a function $\mathbb {R}^n\ni x=(x_1,\dots ,x_n)\to F(x_1,\dots ,x_n)$ be nondecreasing in each of its n arguments and Schur-concave. Then the function $\Lambda =\Lambda _{\alpha ,\beta ,\gamma }\colon L^n\to \mathbb {R}$ defined by the formula

$$\displaystyle \begin{aligned} \Lambda(A_1,\dots,A_n):=F(\operatorname{card} G(A_1),\dots,\operatorname{card} G(A_n)) \end{aligned}$$

for $(A_1,\dots,A_n)\in L^n$ is generalized (n:2)-submodular and hence generalized n-submodular.

For readers’ convenience, here is a direct verification of the fact that maps of the form (8.10) are submodular: noting that G(A ∪ B) = G(A) ∪ G(B) and G(A ∩ B) ⊆ G(A) ∩ G(B) and using the additivity of μ, we have

$$\displaystyle \begin{aligned} \mu(G(A\cup B))+\mu(G(A\cap B)) \le \mu(G(A)\cup G(B))+\mu(G(A)\cap G(B))=\mu(G(A))+\mu(G(B)) \end{aligned}$$

for all relevant sets A and B.
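The verification above can also be run exhaustively on a small example. The following Python sketch assumes a hypothetical relation G ⊆ S × T on a four-point set S and takes μ to be the counting measure, so that the map in (8.10) becomes A ↦ card G(A); the names `image` and `lam` are introduced here only for illustration.

```python
from itertools import combinations

S = range(4)
# A hypothetical relation G ⊆ S × T, chosen only for illustration.
G = {(0, 'a'), (0, 'b'), (1, 'b'), (2, 'c'), (3, 'a'), (3, 'c'), (3, 'd')}

def image(A):
    """G(A) = {t : (s, t) in G for some s in A}."""
    return {t for (s, t) in G if s in A}

def lam(A):
    """Counting measure of G(A), a map of the form (8.10)."""
    return len(image(A))

subsets = [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]
submodular = all(lam(A) + lam(B) >= lam(A | B) + lam(A & B)
                 for A in subsets for B in subsets)
print(submodular)
```

The inequality can be strict: for A = {0} and B = {3} one has G(A ∩ B) = ∅ while G(A) ∩ G(B) = {a}.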

8.2.2 Generalized One-Sided Potential

Let here L be the lattice of all measurable real-valued functions on a measure space (S, Σ, μ), with the pointwise lattice operations ∨ and ∧. Consider the function $\Lambda \colon L^n\to \mathcal {R}$ given by the formula

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n):=\Lambda_{\varphi,\psi}(f_1,\dots,f_n):=\sum_{j,k=1}^n\Psi(f_j-f_k) \end{aligned} $$

(8.11)

for all f = (f ₁, …, f _n) ∈ L ⁿ, where

$$\displaystyle \begin{aligned} \Psi(g):=\psi\Big(\int_S(\varphi\circ g )\operatorname{d}\mu\Big) \end{aligned} $$

(8.12)

for all g ∈ L, $\varphi \colon \mathbb {R}\to [0,\infty ]$ is a nondecreasing or nonincreasing function, and ψ: [0, ∞] → (−∞, ∞] is a concave function. Thus, the function Λ = Λ_φ,ψ may be referred to as a generalized one-sided potential, since the function φ is assumed to be monotonic.

Proposition 8.2.3

The function $\Lambda=\Lambda_{\varphi,\psi}$ defined by formula (8.11) is generalized (n:2)-submodular and hence generalized n-submodular.
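A quick numerical sanity check of Proposition 8.2.3 is sketched below, under assumptions made only for this illustration: S is a four-point set with the counting measure, φ(x) = max(x, 0) (nondecreasing), and ψ is the square root (concave). The check compares Λ at random triples of integer-valued functions with Λ at their pointwise order statistics; a small tolerance guards against floating-point rounding.

```python
import math
import random

def Psi(g):
    """psi(integral of phi∘g) with phi(x) = max(x, 0), psi = sqrt, counting measure."""
    return math.sqrt(sum(max(v, 0) for v in g))

def Lam(fs):
    """Formula (8.11): sum of Psi(f_j - f_k) over all pairs (j, k)."""
    return sum(Psi([a - b for a, b in zip(fj, fk)]) for fj in fs for fk in fs)

def order_stats(fs):
    """Pointwise order statistics, as in condition (8.3)."""
    cols = [sorted(vals) for vals in zip(*fs)]
    return [tuple(c[j] for c in cols) for j in range(len(fs))]

random.seed(0)
checks = []
for _ in range(200):
    fs = [tuple(random.randint(-5, 5) for _ in range(4)) for _ in range(3)]
    checks.append(Lam(fs) >= Lam(order_stats(fs)) - 1e-9)
print(all(checks))
```

Such a random search cannot prove the proposition, of course; it only fails to find a counterexample among the sampled triples.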

8.2.3 Symmetric Sums of Nonnegative Multiadditive Functions

Let k be a natural number. Let L be a sublattice of the lattice $\mathbb {R}^S$ of all real-valued functions on a set S. Let us say that the lattice L is complementable if f ∖ g := f − f ∧ g ∈ L for any f and g in L, so that f = f ∧ g + f ∖ g. Assuming that L is complementable, let us say that a function $m\colon L\to \mathbb {R}$ is additive if

$$\displaystyle \begin{aligned} m(f)=m(f\wedge g)+m(f\setminus g) \end{aligned}$$

for all f and g in L; further, let us say that a function $m\colon L^k\to \mathbb {R}$ is multiadditive or, more specifically, k-additive if m is additive in each of its k arguments, that is, if for each j ∈ [k] and each (k − 1)-tuple (f _i: i ∈ [k] ∖{j}) the function L ∋ f _j↦m(f ₁, …, f _k) is additive.

To state the main result of this subsection, we shall need the following notation: for any set J, let $\Pi _k^J$ denote the set of all k-permutations of J, that is, the set of all injective maps of the set [k] to J.

Proposition 8.2.4

Suppose that k and n are natural numbers such that k ≤ n, L is a complementable sublattice of $\mathbb {R}^S$, and $m\colon L^k\to \mathbb {R}$ is a nonnegative multiadditive function. Then the function $\Lambda _m\colon L^n\to \mathbb {R}$ defined by the formula

$$\displaystyle \begin{aligned} \Lambda_m(f_1,\dots,f_n):=\sum_{\pi\in\Pi_k^{[n]}}m(f_{\pi(1)},\dots,f_{\pi(k)}) \end{aligned} $$

(8.13)

for $(f_1,\dots,f_n)\in L^n$ is generalized (n:2)-submodular and hence generalized n-submodular.

Formula (8.13) can be rewritten in the following symmetrized form:

$$\displaystyle \begin{aligned} \Lambda_m(f_1,\dots,f_n)=k!\,\sum_{I\in\binom{[n]}k}\overline{m}(f_I), \end{aligned} $$

(8.14)

where, for I = {i ₁, …, i _k} with 1 ≤ i ₁ < ⋯ < i _k ≤ n,

$$\displaystyle \begin{aligned} \overline{m}(f_I):=\overline{m}(f_{i_1},\dots,f_{i_k}):=\frac 1{k!}\,\sum_{\pi\in\Pi_k^I}m(f_{\pi(1)},\dots,f_{\pi(k)}); \end{aligned} $$

(8.15)

note that the so-defined function $\overline {m}\colon L^k\to \mathbb {R}$ is multiadditive and nonnegative, given that m is so. Also, $\overline {m}$ is permutation-symmetric in the sense that $\overline {m}(f_{\pi (1)},\dots ,f_{\pi (k)})=\overline {m}(f_1,\dots ,f_k)$ for all (f ₁, …, f _k) ∈ L ^k and all permutations $\pi \in \Pi _k^{[k]}$.

Example 8.2.5

If V is a vector sublattice of the lattice $\mathbb {R}^S$ and L is the lattice of all nonnegative functions in V then, clearly, L is complementable and the restriction to L ^k of any multilinear function from V ^k to $\mathbb {R}$ is multiadditive.

In particular, if μ is a measure on a σ-algebra Σ over S, V is a vector sublattice of L ^k(S, Σ, μ), and L is the lattice of all nonnegative functions in V , then the function $m\colon L^k\to \mathbb {R}$ given by the formula

$$\displaystyle \begin{aligned}m(f_1,\dots,f_k):=\int_S f_1\cdots f_k \operatorname{d}\mu\end{aligned}$$

for (f ₁, …, f _k) ∈ L ^k is multiadditive.

So, by Proposition 8.2.4, the functions Λ_m corresponding to the functions m presented above in this example are generalized (n:2)-submodular and hence generalized n-submodular.

Let now B = (b _i,j) be a d × p matrix with d ≤ p and nonnegative entries b _i,j. The permanent of B is defined by the formula

$$\displaystyle \begin{aligned} \operatorname{perm} B:=\sum_{J\in\binom{[p]}d}\operatorname{perm} B_{\cdot J}, \end{aligned}$$

where $B_{\cdot J}$ is the square submatrix of B consisting of the columns of B with column indices in the set $J\in \binom {[p]}d$; and for a square d × d matrix $B=(b_{i,j})$,

$$\displaystyle \begin{aligned} \operatorname{perm} B :=\sum_{\pi\in\Pi_d^{[d]}}b_{1,\pi(1)}\cdots b_{d,\pi(d)}. \end{aligned}$$

So, $\operatorname {perm} B$ is a multilinear function of the d-tuple (b _1,⋅, …, b _d,⋅) of the rows of B. Also, if d = p, then $\operatorname {perm} B$ is a multilinear function of the d-tuple (b _⋅,1, …, b _⋅,d) of the columns of B. If d ≥ p, then $\operatorname {perm} B$ may be defined by the requirement that the permanent be invariant with respect to transposition.

Thus, from Proposition 8.2.4 we immediately obtain

Corollary 8.2.6

Assuming that the entries $b_{i,j}$ of the d × p matrix B are nonnegative, $\operatorname {perm} B$ is a generalized d-submodular function of the d-tuple $(b_{1,\cdot},\dots,b_{d,\cdot})$ of its rows and a generalized p-submodular function of the p-tuple $(b_{\cdot,1},\dots,b_{\cdot,p})$ of its columns (with respect to the standard lattice structures on $\mathbb {R}^{1\times p}$ and $\mathbb {R}^{d\times 1}$, respectively).

Note that the condition d ≤ p is not needed or assumed in Corollary 8.2.6.
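The row part of Corollary 8.2.6 lends itself to a direct numerical illustration: replacing the rows of B by their “order statistics” amounts to sorting each column of B, and the permanent should not increase. The Python sketch below assumes d ≤ p and small nonnegative integer entries; `perm` and `sort_columns` are hypothetical helper names, and the brute-force permanent is meant for tiny matrices only.

```python
from itertools import permutations
import random

def perm(B):
    """Permanent of a d x p matrix with d <= p: sum over injective maps [d] -> [p]."""
    d, p = len(B), len(B[0])
    total = 0
    for cols in permutations(range(p), d):
        term = 1
        for i, c in enumerate(cols):
            term *= B[i][c]
        total += term
    return total

def sort_columns(B):
    """Replace the rows of B by their order statistics: sort each column ascending."""
    cols = [sorted(col) for col in zip(*B)]
    return [list(row) for row in zip(*cols)]

random.seed(1)
ok = True
for _ in range(100):
    B = [[random.randint(0, 4) for _ in range(4)] for _ in range(3)]
    ok = ok and perm(B) >= perm(sort_columns(B))
print(ok)
```

For instance, the matrix with rows (2, 1) and (1, 2) has permanent 5, whereas after sorting its columns the rows become (1, 1) and (2, 2), with permanent 4.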

Yet another way in which multilinear and hence multiadditive functions may arise is via the elementary symmetric polynomials. Let n be any natural number, and let k ∈ [n]. The elementary symmetric polynomials are defined by the formula

$$\displaystyle \begin{aligned} e_k(x_1,\dots,x_n):=\sum_{J\in\binom{[n]}k}\prod_{j\in J}x_j. \end{aligned}$$

In particular, $e_1(x_1,\dots,x_n)=\sum_{j\in[n]}x_j$ and $e_n(x_1,\dots,x_n)=\prod_{j\in[n]}x_j$.

Let f = (f ₁, …, f _n) be the vector of measurable functions f ₁, …, f _n defined on a measure space (S, Σ, μ) with values in the interval [0, ∞). Then it is not hard to see that the “order statistics” are nonnegative measurable functions as well. As usual, let $\mu (h):=\int _S h\,\operatorname {d}\mu $.

If the measure μ is a probability measure, then the functions f ₁, …, f _n are called random variables (r.v.’s) and, in this case, f _n:1, …, f _n:n will indeed be what is commonly referred to as the order statistics based on the “random sample” f = (f ₁, …, f _n); cf. e.g. [7]. In contrast with settings common in statistics, in general we do not impose any conditions on the joint or individual distributions of the r.v.’s f ₁, …, f _n—except that these r.v.’s be nonnegative.

Then we have the following.

Corollary 8.2.7

$$\displaystyle \begin{aligned} e_k\big(\mu(f_1),\dots,\mu(f_n)\big)\ge e_k\big(\mu(f_{n:1}),\dots,\mu(f_{n:n})\big). \end{aligned} $$

(8.16)

In particular,

$$\displaystyle \begin{aligned} \mu(f_1)\cdots\mu(f_n)\ge\mu(f_{n:1})\cdots\mu(f_{n:n}). \end{aligned} $$

(8.17)

This follows immediately from Proposition 8.2.4 and formula (8.14), since the product μ(f ₁)⋯μ(f _k) is clearly multilinear and hence multiadditive in (f ₁, …, f _k).

To deal with cases when some of the $\mu(f_j)$’s (or the $\mu(f_{n:j})$’s) equal 0 and other ones equal ∞, let us assume here the convention 0 ⋅ ∞ := 0. One may note that, if the nonnegative functions $f_1,\dots,f_n$ are scalar multiples of one another or, more generally, if $f_{\pi(1)}\le\dots\le f_{\pi(n)}$ for some permutation π of the set [n], then inequality (8.16) turns into an equality.
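Inequality (8.16) can be explored numerically as well. The sketch below makes assumptions for illustration only: S is a five-point set, μ is the counting measure (so μ(h) is the sum of the values of h), and the functions take random integer values in [0, 9]. The names `e`, `mu`, and `order_stats` are hypothetical helpers.

```python
from itertools import combinations
from math import prod
import random

def e(k, xs):
    """Elementary symmetric polynomial e_k(x_1, ..., x_n)."""
    return sum(prod(c) for c in combinations(xs, k))

def order_stats(fs):
    """Pointwise order statistics, as in condition (8.3)."""
    cols = [sorted(vals) for vals in zip(*fs)]
    return [tuple(c[j] for c in cols) for j in range(len(fs))]

def mu(h):
    """Integral of h with respect to the counting measure on a finite S."""
    return sum(h)

random.seed(2)
holds = True
for _ in range(200):
    fs = [tuple(random.randint(0, 9) for _ in range(5)) for _ in range(4)]
    gs = order_stats(fs)
    for k in range(1, 5):
        holds = holds and e(k, [mu(f) for f in fs]) >= e(k, [mu(g) for g in gs])
print(holds)
```

For k = 1 both sides coincide, since the order statistics are a pointwise rearrangement of the original values; a strict inequality already occurs for the indicator functions (1, 0) and (0, 1) on a two-point set, where the product of the integrals drops from 1 to 0.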

As mentioned above, in Corollary 8.2.7 it is not assumed that f ₁, …, f _n are independent r.v.’s. However, if μ is a probability measure and the r.v.’s f ₁, …, f _n are independent (but not necessarily identically distributed), then μ(f ₁)⋯μ(f _n) = μ(f ₁⋯f _n) = μ(f _n:1⋯f _n:n) by the second part of (8.3), and so, (8.17) can then be rewritten as the following positive-association-type inequality for the order statistics:

$$\displaystyle \begin{aligned} \mu(f_{n:1}\cdots f_{n:n})\ge\mu(f_{n:1})\cdots\mu(f_{n:n}). \end{aligned} $$

(8.18)

Let now ψ be any monotone (that is, either nondecreasing or nonincreasing) function from [0, ∞] to [0, ∞]. For f = (f ₁, …, f _n) as before, let

$$\displaystyle \begin{aligned}\psi\bullet f:=(\psi\circ f_1,\dots,\psi\circ f_n).\end{aligned}$$

Then for j ∈ [n] one has (ψ•f)_n:j = ψ ∘ f _n:j if ψ is nondecreasing and (ψ•f)_n:j = ψ ∘ f _n:n+1−j if ψ is nonincreasing. Thus, we have the following ostensibly more general forms of (8.17) and (8.18):

Corollary 8.2.8

$$\displaystyle \begin{aligned} \mu(\psi\circ f_1)\cdots\mu(\psi\circ f_n)\ge\mu\big((\psi\bullet f)_{n:1}\big)\cdots\mu\big((\psi\bullet f)_{n:n}\big). \end{aligned} $$

(8.19)

If μ is a probability measure and the r.v.’s f ₁, …, f _n are independent, then

$$\displaystyle \begin{aligned} \mu\big((\psi\bullet f)_{n:1}\cdots (\psi\bullet f)_{n:n}\big) \ge\mu\big((\psi\bullet f)_{n:1}\big)\cdots\mu\big((\psi\bullet f)_{n:n}\big). \end{aligned} $$

(8.20)

The property of the order statistics $f_{n:1},\dots,f_{n:n}$ given by inequality (8.20) may be called the diagonal positive orthant dependence; cf. e.g. Definition 2.3 in [12] of the negative orthant dependence.

Immediately from Theorem 8.1.1 or from inequality (8.19) in Corollary 8.2.8, one obtains

Corollary 8.2.9

Take any $p\in \mathbb {R}\setminus \{0\}$ . Then

$$\displaystyle \begin{aligned} \mu(f_1^p)^r\cdots\mu(f_n^p)^r\ge\mu(f_{n:1}^p)^r\cdots\mu(f_{n:n}^p)^r \end{aligned} $$

(8.21)

for any r ∈ (0, ∞), and

$$\displaystyle \begin{aligned} \mu(f_1^p)^r\cdots\mu(f_n^p)^r\le\mu(f_{n:1}^p)^r\cdots\mu(f_{n:n}^p)^r \end{aligned} $$

(8.22)

for any r ∈ (−∞, 0). Here we use the conventions $0^t:=\infty$ and $\infty^t:=0$ for t ∈ (−∞, 0). We also use the following conventions: 0 ⋅ ∞ := 0 concerning (8.21) and 0 ⋅ ∞ := ∞ concerning (8.22).

Consider now the special case of Corollary 8.2.9 with r = 1∕p. Letting then p → ∞, we see that (8.21) will hold with the $\mu (f_j^p)^r$’s and $\mu (f_{n:j}^p)^r$’s replaced there by $\operatorname{ess\,sup} f_j$ and $\operatorname{ess\,sup} f_{n:j}$, respectively, where $\operatorname{ess\,sup}$ denotes the essential supremum with respect to measure μ. This follows because $\mu(h^p)^{1/p}\to\operatorname{ess\,sup}h$ as p → ∞. Similarly, letting p → −∞, we see that (8.22) will hold with the $\mu (f_j^p)^r$’s and $\mu (f_{n:j}^p)^r$’s replaced there by $\operatorname{ess\,inf} f_j$ and $\operatorname{ess\,inf} f_{n:j}$, respectively, where $\operatorname{ess\,inf}$ denotes the essential infimum with respect to μ. Moreover, considering (say) the counting measures μ on finite subsets of the set S and noting that $\sup h=\sup _S h$ coincides with the limit of the net $(\max_J h)$ over the filter of all finite subsets J of S, we conclude that (8.21) will hold with the $\mu (f_j^p)^r$’s and $\mu (f_{n:j}^p)^r$’s replaced there by $\sup f_j$ and $\sup f_{n:j}$, respectively. (The statement about the limit can be spelled out as follows: $\sup_S h\ge\max_J h$ for all finite J ⊆ S, and for each real c such that $c<\sup h$ there is some finite set $J_c\subseteq S$ such that for all finite sets J such that $J_c\subseteq J\subseteq S$ one has $\max_J h>c$.) Similarly, (8.22) will hold with the $\mu (f_j^p)^r$’s and $\mu (f_{n:j}^p)^r$’s replaced there by $\inf f_j$ and $\inf f_{n:j}$, respectively. Thus, we have

Corollary 8.2.10

$$\displaystyle \begin{aligned} (\sup f_1)\cdots(\sup f_n)\ge(\sup f_{n:1})\cdots(\sup f_{n:n}) \end{aligned} $$

(8.23)

and

$$\displaystyle \begin{aligned} (\inf f_1)\cdots(\inf f_n)\le(\inf f_{n:1})\cdots(\inf f_{n:n}). \end{aligned} $$

(8.24)

Here we use the following conventions: 0 ⋅∞ := 0 concerning (8.23) and 0 ⋅∞ := ∞ concerning (8.24).

Alternatively, one can obtain (8.23) and (8.24) directly from Theorem 8.1.1.

Also, of course there is no need to assume in Corollary 8.2.10 that the functions f ₁, …, f _n are measurable.

The special cases of inequalities (8.22) and (8.24) for n = 2 mean that the functions h↦μ(h ^p)^r and $h\mapsto \inf h$ are log-supermodular functions on the distributive lattice (say $\mathcal {L}_{\Sigma }$) of all nonnegative Σ-measurable functions on S and on the distributive lattice (say $\mathcal {L}$) of all nonnegative functions on S, respectively.

At this point, let us recall the famous Fortuin–Kasteleyn–Ginibre (FKG) correlation inequality [8], which states that for any log-supermodular function ν on a finite distributive lattice L and any nondecreasing functions F and G on L we have

$$\displaystyle \begin{aligned} \nu(FG)\nu(1)\ge\nu(F)\nu(G), \end{aligned}$$

where $\nu(F):=\sum_{f\in L}F(f)\,\nu(f)$.
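To make the FKG inequality concrete, here is a small exact check in Python. It assumes the usual reading ν(F) = Σ_{f ∈ L} F(f) ν(f), takes L to be the lattice of all subsets of a three-point set, and uses the hypothetical weight ν(A) = 2^{|A|²}, which is log-supermodular because A ↦ |A|² is supermodular. The functions F and G are sample nondecreasing functions chosen for illustration.

```python
from itertools import combinations

ground = (0, 1, 2)
lattice = [frozenset(c) for k in range(4) for c in combinations(ground, k)]

def nu(A):
    """Log-supermodular weight: log2 nu(A) = |A|^2 is supermodular in A."""
    return 2 ** (len(A) ** 2)

def F(A):
    """Nondecreasing in A: cardinality."""
    return len(A)

def G(A):
    """Nondecreasing in A: indicator of the event that A contains the point 0."""
    return 1 if 0 in A else 0

def integral(H):
    """nu(H) = sum of H(A) * nu(A) over the lattice."""
    return sum(H(A) * nu(A) for A in lattice)

log_supermodular = all(nu(A | B) * nu(A & B) >= nu(A) * nu(B)
                       for A in lattice for B in lattice)
lhs = integral(lambda A: F(A) * G(A)) * integral(lambda A: 1)
rhs = integral(F) * integral(G)
print(log_supermodular, lhs, rhs)
```

With these choices ν(FG) ν(1) = 1602 × 567 and ν(F) ν(G) = 1638 × 546, so the left-hand side is the larger one, as the FKG inequality predicts.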

Then we immediately obtain

Corollary 8.2.11

Let $\mathcal {L}^\circ _{\Sigma }$ be any finite sub-lattice of the lattice $\mathcal {L}_{\Sigma }$ , and let F and G be nondecreasing functions from $\mathcal {L}^\circ _{\Sigma }$ to $\mathbb {R}$ . Then

$$\displaystyle \begin{aligned} \Big(\sum_{h\in\mathcal{L}^\circ_{\Sigma}}F(h)G(h)\mu(h)^r\Big)\Big(\sum_{h\in\mathcal{L}^\circ_{\Sigma}}\mu(h)^r\Big) \ge\Big(\sum_{h\in\mathcal{L}^\circ_{\Sigma}}F(h)\mu(h)^r\Big)\Big(\sum_{h\in\mathcal{L}^\circ_\Sigma}G(h)\mu(h)^r\Big) \end{aligned}$$

for any r ∈ (−∞, 0). Similarly, let $\mathcal {L}^\circ $ be any finite sub-lattice of the lattice $\mathcal {L}$, and let F and G be nondecreasing functions from $\mathcal {L}^\circ $ to $\mathbb {R}$. Then

$$\displaystyle \begin{aligned} \Big(\sum_{h\in\mathcal{L}^\circ}F(h)G(h)\inf h\Big)\Big(\sum_{h\in\mathcal{L}^\circ}\inf h\Big) \ge\Big(\sum_{h\in\mathcal{L}^\circ}F(h)\inf h\Big)\Big(\sum_{h\in\mathcal{L}^\circ}G(h)\inf h\Big). \end{aligned}$$

As shown by Ahlswede and Daykin [2, pp. 288–289], their inequality [2, Theorem 1] almost immediately implies, and is in a sense sharper than, the FKG inequality. Furthermore, Rinott and Saks [21, 22] and Aharoni and Keich [1] independently obtained a more general inequality “for n-tuples of nonnegative functions on a distributive lattice, of which the Ahlswede–Daykin inequality is the case n = 2.” More specifically, in notation closer to that used in the present paper, [1, Theorem 1.1] states the following:

Let α ₁, …, α _n, β ₁, …, β _n be nonnegative functions defined on a distributive lattice L such that

$$\displaystyle \begin{aligned} \prod_{j=1}^n\alpha_j(f_j)\le\prod_{j=1}^n\beta_j(f_{n:j}) \end{aligned}$$

for all f ₁, …, f _n in L. Then for any finite subsets F ₁, …, F _n of L

$$\displaystyle \begin{aligned} \prod_{j=1}^n\sum_{f_j\in F_j}\alpha_j(f_j)\le\prod_{j=1}^n\sum_{g_j\in F_{n:j}}\beta_j(g_j), \end{aligned}$$

where

$$\displaystyle \begin{aligned} F_{n:j}:=\{f_{n:j}\colon f=(f_1,\dots,f_n)\in F_1\times\dots\times F_n\}. \end{aligned}$$

Note that the definition of the “order statistics” used in [1] is different from (8.2) in that their “order statistics” go in the descending, rather than ascending, order; also, the term “order statistics” is not used in [1].

In view of this result of [1] and our Corollaries 8.2.9 and 8.2.10, one immediately obtains the following statement, which generalizes and strengthens Corollary 8.2.11:

Corollary 8.2.12

Let $\mathcal {F}_1,\dots ,\mathcal {F}_n$ be any finite subsets of the lattice $\mathcal {L}_\Sigma $. For each j ∈ [n], let

$$\displaystyle \begin{aligned} \mathcal{F}_{n:j}:=\{f_{n:j}\colon f=(f_1,\dots,f_n)\in\mathcal{F}_1\times\dots\times\mathcal{F}_n\}. \end{aligned}$$

Then

$$\displaystyle \begin{aligned} \prod_{j=1}^n\sum_{f_j\in\mathcal{F}_j}\mu(f_j)^r\le\prod_{j=1}^n\sum_{h_j\in\mathcal{F}_{n:j}}\mu(h_j)^r \end{aligned} $$

(8.25)

for any r ∈ (−∞, 0).

Similarly, let now $\mathcal {F}_1,\dots ,\mathcal {F}_n$ be any finite subsets of the lattice $\mathcal {L}$ . Then

$$\displaystyle \begin{aligned} \prod_{j=1}^n\sum_{f_j\in\mathcal{F}_j}\inf f_j\le\prod_{j=1}^n\sum_{h_j\in\mathcal{F}_{n:j}}\inf h_j. \end{aligned}$$

Comparing inequalities (8.21) and (8.22) in Corollary 8.2.9 or inequalities (8.23) and (8.24) in Corollary 8.2.10, one may wonder whether the FKG-type inequalities stated in Corollaries 8.2.11 and 8.2.12 for the functions h↦μ(h)^r with r < 0 and $h\mapsto \inf h$ admit of the corresponding reverse analogues for the functions h↦μ(h)^r with r > 0 and $h\mapsto \sup h$. However, it is not hard to see that such FKG-type inequalities are not reversible in this sense, a reason being that the sets $\mathcal {F}_{n:j}$ may be much larger than the sets $\mathcal {F}_j$.

E.g., suppose that n = 2, $S=\mathbb {R}$, μ is a Borel probability measure on $\mathbb {R}$, 0 < ε < δ < 1, N is a natural number, $\mathcal {F}_1$ is the set of N pairwise distinct constant functions f₁, …, f_N on $\mathbb {R}$ such that 1 − ε < f_j < 1 + ε for all j ∈ [N], and $\mathcal {F}_2=\{g_1,\dots ,g_N\}$, where g_j := (1 − δ)1_(−∞,j] + (1 + δ)1_(j,∞) and 1_A denotes the indicator of a set A. Then it is easy to see that each of the sets $\mathcal {F}_{2:1}$ and $\mathcal {F}_{2:2}$ is of cardinality N². So, letting δ ↓ 0 (so that ε ↓ 0 as well), we see that, for any real r, the right-hand side of (8.25) goes to N⁴ whereas its left-hand side goes to N², which is much less than N⁴ if N is large.
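The example can be reproduced numerically. The sketch below rests on assumptions that are not part of the text: μ is taken to be the uniform probability measure on the N + 1 points 0.5, 1.5, …, N + 0.5, μ(h) is read as the integral of h with respect to μ, and each function is stored as the tuple of its values at those points (which suffices to tell the functions apart). The name `both_sides` is illustrative.

```python
from itertools import product

def both_sides(N, eps, delta, r):
    """Both sides of (8.25) for n = 2 in the example above.

    Assumptions of this sketch: mu is uniform on 0.5, 1.5, ..., N + 0.5,
    mu(h) is the integral of h, and a function is the tuple of its values
    at these points.
    """
    points = [i + 0.5 for i in range(N + 1)]
    # N distinct constants strictly between 1 - eps and 1 + eps
    consts = [1 - eps + 2 * eps * (i + 1) / (N + 1) for i in range(N)]
    F1 = [tuple(c for _ in points) for c in consts]
    F2 = [tuple(1 - delta if x <= j else 1 + delta for x in points)
          for j in range(1, N + 1)]
    # pointwise minima and maxima: the sets F_{2:1} and F_{2:2}
    F21 = {tuple(map(min, f, g)) for f, g in product(F1, F2)}
    F22 = {tuple(map(max, f, g)) for f, g in product(F1, F2)}

    def total(family):
        # sum of mu(h)^r over the family
        return sum((sum(h) / len(h)) ** r for h in family)

    lhs = total(F1) * total(F2)
    rhs = total(F21) * total(F22)
    return lhs, rhs, len(F21), len(F22)

lhs, rhs, size_min, size_max = both_sides(N=5, eps=0.01, delta=0.02, r=1)
print(size_min, size_max, lhs < rhs)
```

With r = 1 the left-hand side stays near N² and the right-hand side near N⁴, so the reversed inequality fails, in line with the discussion above.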

Example 8.2.13

Closely related to Example 8.2.5 is the following. Suppose that (S, Σ) is a measurable space, μ is a measure on the product σ-algebra Σ^⊗k, and L is a subring of Σ. Then L is complementable and the function $m\colon L^k\to \mathbb {R}$ given by the formula

$$\displaystyle \begin{aligned} m(A_1,\dots,A_k):=\mu(A_1\times\dots\times A_k) \end{aligned} $$

(8.26)

for (A ₁, …, A _k) ∈ L ^k is multiadditive.

A particular case of formula (8.26) is

$$\displaystyle \begin{aligned} m(A_1,\dots,A_k):=\operatorname{card}\big(G\cap(A_1\times\dots\times A_k)\big), \end{aligned} $$

(8.27)

where $\operatorname {card}$ stands for the cardinality and G is an arbitrary subset of S ^k. If G is symmetric in the sense that (s ₁, …, s _k) ∈ G iff (s _π(1), …, s _π(k)) ∈ G for all permutations π of the set [k], then G represents the set (say E) of all hyperedges of a k-uniform hypergraph over S, in the sense that (s ₁, …, s _k) ∈ G iff {s ₁, …, s _k}∈ E.
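For a finite S, formula (8.27) is easy to evaluate directly. The following sketch assumes a small hypothetical 3-uniform hypergraph on S = {0, …, 4} and checks additivity of m in its first argument over a disjoint union; the function name `m` follows (8.27), while `edges` and the chosen sets are illustrative.

```python
from itertools import product

def m(G, *sets):
    """card(G ∩ (A_1 x ... x A_k)) for a finite set G of k-tuples, as in (8.27)."""
    return sum(all(s in a for s, a in zip(t, sets)) for t in G)

# Hypothetical 3-uniform hypergraph on S = {0,...,4}, stored as a symmetric G
edges = [{0, 1, 2}, {1, 2, 3}, {0, 3, 4}]
G = {t for t in product(range(5), repeat=3) if set(t) in edges}

A, A2 = {0, 1}, {3}          # disjoint sets
B, C = {1, 2, 3}, {0, 2, 4}
additive = m(G, A | A2, B, C) == m(G, A, B, C) + m(G, A2, B, C)
print(len(G), additive)
```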

We now have another immediate corollary of Proposition 8.2.4:

Corollary 8.2.14

Suppose that k and n are natural numbers such that k ≤ n, (S, Σ) is a measurable space, μ is a measure on the product σ-algebra Σ ^⊗k, and L is a subring of Σ. Then

(8.28)

for all (A ₁, …, A _n) ∈ L ⁿ.

8.3 Proofs

One may note that formula (8.31) in the proof of Theorem 8.1.1 below defines a step similar to a step in the process of the so-called insertion search (cf. e.g. [15, Section 5.2.1]; also called the sifting or sinking technique). The difference is that here we do the pointwise comparison of functions (rather than numbers), and therefore we do not stop when the right place of the value f_{n+1}(s) of the “new” function f_{n+1} among the already ordered values f_{n:1}(s), …, f_{n:n}(s) at a particular point s ∈ S has been found, because this place will in general depend on s. So, the proof that (8.31) implies (8.34) may be considered as (something a bit more than) a rigorous proof of the validity of the insertion search algorithm, avoiding such informal, undefined terms as swap, moving, and interleaving.

Proof of Theorem 8.1.1

Let us prove the theorem by induction in n. For n = 1, the result is trivial. To make the induction step, it suffices to prove the following: For any natural n ≥ 2, if the function $\Lambda \colon L^n\to \mathcal {R}$ is generalized (n:2)-semimodular and the function Lⁿ⁻¹ ∋ (f₁, …, f_{n−1}) ↦ Λ(f₁, …, f_n) is generalized (n − 1)-semimodular for each f_n ∈ L, then Λ is generalized n-semimodular. Thus, we are assuming that the function $\Lambda \colon L^n\to \mathcal {R}$ is generalized (n:2)-semimodular and

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n)\mathbin{\Join}\Lambda(f_{n-1:1},\dots,f_{n-1:n-1},f_n) \end{aligned} $$

(8.29)

for all (f ₁, …, f _n) ∈ L ⁿ, where f _n−1:1, …, f _n−1:n−1 are the “order statistics” based on (f ₁, …, f _n−1).

Take indeed any (f ₁, …, f _n) ∈ L ⁿ. Define the rectangular array of functions $(g_{k,j}\colon k\in \overline {0,{n-1}}, j\in [n])$ recursively, as follows:

$$\displaystyle \begin{aligned} (g_{0,1},\dots,g_{0,n-1},g_{0,n}):=(f_{n-1:1},\dots,f_{n-1:n-1},f_n) \end{aligned} $$

(8.30)

and, for $k\in \overline {1,{n-1}}$ and j ∈ [n],

$$\displaystyle \begin{aligned} g_{k,j}:= \left\{ \begin{alignedat}{2} & g_{k-1,j} && \ \ \text{if}\ \ j\in\overline{1,{n-k-1}}\,\cup\,\overline{{n-k+2},n}, \\ & g_{k-1,n-k}\wedge g_{k-1,n-k+1} && \ \ \text{if}\ \ j=n-k, \\ & g_{k-1,n-k}\vee g_{k-1,n-k+1} && \ \ \text{if}\ \ j=n-k+1. \end{alignedat} \right. \end{aligned} $$

(8.31)
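The recursion (8.30)–(8.31) can be sketched in code for functions on a finite set S, each stored as a tuple of values; this representation and the names `insertion_pass` and `order_statistics` are assumptions made for illustration only. Each pass replaces one adjacent pair by its pointwise minimum and maximum, moving from right to left, and never stops early.

```python
def insertion_pass(sorted_prefix, new):
    """Apply (8.30)-(8.31) to functions stored as tuples of values on a finite S.

    sorted_prefix: the pointwise order statistics f_{n-1:1}, ..., f_{n-1:n-1}
    new:           the function f_n
    Returns the row (g_{n-1,1}, ..., g_{n-1,n}).
    """
    g = list(sorted_prefix) + [new]           # row k = 0, formula (8.30)
    n = len(g)
    for k in range(1, n):                     # rows k = 1, ..., n-1
        lo, hi = n - k - 1, n - k             # 0-based positions of j = n-k, n-k+1
        meet = tuple(map(min, g[lo], g[hi]))  # pointwise minimum
        join = tuple(map(max, g[lo], g[hi]))  # pointwise maximum
        g[lo], g[hi] = meet, join             # formula (8.31); other entries are kept
    return g

def order_statistics(fs):
    """Pointwise order statistics, built by repeated insertion passes."""
    g = [fs[0]]
    for f in fs[1:]:
        g = insertion_pass(g, f)
    return g

# Four hypothetical functions on a 3-point set S
fs = [(3, 0, 2), (1, 5, 2), (2, 4, 0), (0, 1, 9)]
print(order_statistics(fs))
```

At each point s the result should agree with sorting the values f₁(s), …, f_n(s), which is what identity (8.34) asserts.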

By (8.29) and (8.30),

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n)\mathbin{\Join}\Lambda(g_{0,1},\dots,g_{0,n-1},g_{0,n}). \end{aligned} $$

(8.32)

Moreover, for each $k\in \overline {1,{n-1}}$,

$$\displaystyle \begin{aligned} \Lambda(g_{k-1,1},\dots,g_{k-1,n}) \mathbin{\Join}\Lambda(g_{k,1},\dots,g_{k,n}), \end{aligned} $$

(8.33)

since Λ is generalized (n:2)-semimodular.

It follows from (8.32) and (8.33) that

$$\displaystyle \begin{aligned} \Lambda(f_1,\dots,f_n)\mathbin{\Join} \Lambda(g_{n-1,1},\dots,g_{n-1,n}). \end{aligned}$$

It remains to verify the identity

$$\displaystyle \begin{aligned} (g_{n-1,1},\dots,g_{n-1,n})\overset{\text{(?)}}=(f_{n:1},\dots,f_{n:n}). \end{aligned} $$

(8.34)

In accordance with Remark 8.1.3, we may and shall assume that the distributive lattice L is a lattice of nonnegative real-valued functions on a set S, so that (8.3) holds for each s ∈ S.

In the remainder of the proof, fix any s ∈ S. Then

$$\displaystyle \begin{aligned} \{\{g_{0,1}(s),\dots,g_{0,n}(s)\}\}=\{\{f_1(s),\dots,f_{n}(s)\}\}, \end{aligned}$$

by (8.30) and the second part of (8.3) used with n − 1 in place of n; also, for each $k\in \overline {1,{n-1}}$,

$$\displaystyle \begin{aligned} \{\{g_{k,1}(s),\dots,g_{k,n}(s)\}\}=\{\{g_{k-1,1}(s),\dots,g_{k-1,n}(s)\}\}, \end{aligned}$$

by (8.31). So,

$$\displaystyle \begin{aligned} \{\{g_{n-1,1}(s),\dots,g_{n-1,n}(s)\}\}=\{\{f_1(s),\dots,f_n(s)\}\}. \end{aligned}$$

Therefore, to complete the proof of (8.34) and thus that of Theorem 8.1.1, it remains to show that

$$\displaystyle \begin{aligned} g_{n-1,1}(s)\overset{\text{(?)}}\le\cdots\overset{\text{(?)}}\le g_{n-1,n}(s), \end{aligned} $$

(8.35)

which will follow immediately from

Lemma 8.3.1

For each $k\in \overline {1,{n-1}}$, the following assertion is true for all s ∈ S:

$$\displaystyle \begin{aligned} \begin{gathered} g_{k,j}(s)\le g_{k,j+1}(s)\mathit{\text{ for all }}j\in\overline{1,{n-k-2}}\,\cup\,\overline{{n-k},{n-1}}; \\ \mathit{\text{also, }}g_{k,n-k-1}(s)\le g_{k,n-k+1}(s)\mathit{\text{ if }}k\le n-2. \end{gathered} \end{aligned}$$

(A_k)

Indeed, (8.35) is the first clause in assertion (A_k) with k = n − 1. Thus, what finally remains to prove Theorem 8.1.1 is to present the following.

Proof of Lemma 8.3.1

For simplicity, let us drop (s), thus writing $g_{k,j}, f_n,\dots$ in place of $g_{k,j}(s), f_n(s),\dots$. We shall prove Lemma 8.3.1 by induction in $k\in \overline {1,{n-1}}$. Assertion (A₁) means that $g_{1,1}\le \dots \le g_{1,n-2}$, $g_{1,n-1}\le g_{1,n}$, and $g_{1,n-2}\le g_{1,n}$ if 1 ≤ n − 2. So, in view of (8.31) and (8.30), (A₁) can be rewritten as follows: $f_{n-1:1}\le \dots \le f_{n-1:n-2}$, $f_{n-1:n-1}\wedge f_n\le f_{n-1:n-1}\vee f_n$, and $f_{n-1:n-2}\le f_{n-1:n-1}\vee f_n$; all these inequalities are obvious. So, (A₁) holds.

Take now any $k\in \overline {2,{n-1}}$ and suppose that (A_{k−1}) holds. We need to show that then (A_k) holds.

For all $j\in \overline {1,{n-k-2}}\,\cup \,\overline {{n-k+2},{n-1}}$, we have $j+1\in \overline {1,{n-k-1}}\,\cup \,\overline {{n-k+2},n}$, whence, by (8.31) and the first clause of (A_{k−1}), $g_{k,j}=g_{k-1,j}\le g_{k-1,j+1}=g_{k,j+1}$. So,

$$\displaystyle \begin{aligned} g_{k,j}\le g_{k,j+1}\quad \text{for }j\in\overline{1,{n-k-2}}\,\cup\,\overline{{n-k+2},{n-1}}. \end{aligned} $$

(8.36)

If j = n − k then, by (8.31), $g_{k,j}=g_{k-1,n-k}\wedge g_{k-1,n-k+1}\le g_{k-1,n-k}\vee g_{k-1,n-k+1}=g_{k,j+1}$.

If j = n − k + 1 then the condition $k\in \overline {2,{n-1}}$ implies j ≤ n − 1, and so, by (8.31) and the second and first clauses of (A_{k−1}), $g_{k,j}=g_{k-1,n-k}\vee g_{k-1,n-k+1}\le g_{k-1,n-k+2}=g_{k-1,j+1}=g_{k,j+1}$.

Thus, in view of (8.36), the first clause of (A_k) holds. Also, if k ≤ n − 2 then, by (8.31) and the first clause of (A_{k−1}), $g_{k,n-k-1}=g_{k-1,n-k-1}\le g_{k-1,n-k}\le g_{k-1,n-k}\vee g_{k-1,n-k+1}=g_{k,n-k+1}$, so that the second clause of (A_k) holds as well. This completes the proof of Lemma 8.3.1.

Thus, Theorem 8.1.1 is proved. □

Proof of Proposition 8.2.1

Take any (f ₁, …, f _n) ∈ L ⁿ. Corollary B.3 in [17] states that x ≺ y iff x is in the convex hull of the set of all points obtained by permuting the coordinates of the vector y. Also, since the function λ is nondecreasing, we have λ(f ₁ ∨ f ₂) ≥ λ(f ₁) ∨ λ(f ₂). For any real a, b, c such that c ≥ a ∨ b, we have (a, b) = (1 − t)(a + b − c, c) + t(c, a + b − c) for $t=\frac {c-b}{2c-a-b}\in [0,1]$ if c > (a + b)∕2 and for any t ∈ [0, 1] otherwise (that is, if a = b = c). So, the point (a, b) is a convex combination of points (a + b − c, c) and (c, a + b − c). Using this fact for a = λ(f ₁), b = λ(f ₂), c = λ(f ₁ ∨ f ₂), we see that

$$\displaystyle \begin{aligned} (\lambda(f_1),\dots,\lambda(f_n)) \prec(\lambda(f_1)+\lambda(f_2)-\lambda(f_1\vee f_2),\lambda(f_1\vee f_2),\lambda(f_3),\dots,\lambda(f_n)). \end{aligned}$$
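The convex-combination identity used in this step can be verified with exact rational arithmetic. This is a minimal sketch; the function name `mixing_weight` and the sample values of a, b, c are illustrative assumptions.

```python
from fractions import Fraction

def mixing_weight(a, b, c):
    """Weight t with (a, b) = (1 - t)(a + b - c, c) + t(c, a + b - c), for c >= max(a, b)."""
    if 2 * c == a + b:            # only possible when a == b == c; any t works
        return Fraction(0)
    return Fraction(c - b) / Fraction(2 * c - a - b)

a, b, c = 3, 5, 7                 # sample values with c >= max(a, b)
t = mixing_weight(a, b, c)
first = (1 - t) * (a + b - c) + t * c
second = (1 - t) * c + t * (a + b - c)
print(t, first, second)
```

The two recovered coordinates should equal a and b, and t should lie in [0, 1] because c − b ≤ 2c − a − b whenever c ≥ a.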

Also, λ(f₁ ∧ f₂) ≤ λ(f₁) + λ(f₂) − λ(f₁ ∨ f₂), by the submodularity of λ. Therefore, and because F is nondecreasing (in each of its n arguments) and Schur-concave, we conclude that

$$\displaystyle \begin{aligned} &F(\lambda(f_1\wedge f_2),\lambda(f_1\vee f_2),\lambda(f_3),\dots,\lambda(f_n)) \\ &\quad\le F(\lambda(f_1)+\lambda(f_2)-\lambda(f_1\vee f_2),\lambda(f_1\vee f_2),\lambda(f_3),\dots,\lambda(f_n)) \\ &\quad\le F(\lambda(f_1),\dots,\lambda(f_n)). \end{aligned} $$

Quite similarly,

$$\displaystyle \begin{aligned} &F(\lambda(f_1),\dots,\lambda(f_{i-1}),\lambda(f_i\wedge f_{i+1}),\lambda(f_i\vee f_{i+1}),\lambda(f_{i+2}),\dots,\lambda(f_n)) \\ &\quad\le F(\lambda(f_1),\dots,\lambda(f_n)) \end{aligned} $$

for all $i\in \overline {1,{n-1}}$, so that the function F is indeed generalized (n:2)-submodular and hence, by Theorem 8.1.1, generalized n-submodular. □

Proof of Proposition 8.2.3

In view of Theorem 8.1.1, it is enough to show that the function Λ = Λ_{φ,ψ} is generalized (n:2)-supermodular. Without loss of generality (w.l.o.g.), we may and shall assume that the function φ is nondecreasing, since $\Lambda _{\varphi ^-,\psi }=\Lambda _{\varphi ,\psi }$, where φ⁻(u) := φ(−u) for all real u. Also, w.l.o.g. ψ(0) = 0 and hence Ψ(0) = 0.

Take any f = (f ₁, …, f _n) ∈ L ⁿ. Then, letting

$$\displaystyle \begin{aligned} \tilde\Psi(g):=\Psi(g)+\Psi(-g) \end{aligned} $$

(8.37)

for g ∈ L, one has

$$\displaystyle \begin{aligned} \Lambda(f_1,f_2,f_3,\dots,f_n) =\tilde\Psi(f_1-f_2)+\sum_{j=3}^n\big(\tilde\Psi(g_j)+\tilde\Psi(h_j)\big)+R, {} \end{aligned} $$

(8.38)

where g_j := f₁ − f_j, h_j := f₂ − f_j, and $R:=\sum _{3\le j<k\le n}\tilde \Psi (f_j-f_k)$. Since f₁ ∧ f₂ − f₁ ∨ f₂ = −|f₁ − f₂|, one similarly has

$$\displaystyle \begin{aligned} \begin{aligned} \Lambda(f_1\wedge f_2,f_1\vee f_2,f_3,\dots,f_n) =& \tilde\Psi(|f_1-f_2|) +\sum_{j=3}^n\big(\tilde\Psi(g_j\wedge h_j)+\tilde\Psi(g_j\vee h_j)\big)+R. \end{aligned} \end{aligned} $$

(8.39)

Next,

$$\displaystyle \begin{aligned} \tilde\Psi(f_1-f_2)&=\psi\Big(\int_S\varphi\circ(f_1-f_2)\operatorname{d}\mu\Big) +\psi\Big(\int_S\varphi\circ(f_2-f_1)\operatorname{d}\mu\Big), {} \end{aligned} $$

(8.40)

$$\displaystyle \begin{aligned} \tilde\Psi(|f_1-f_2|)&=\psi\Big(\int_S\varphi\circ|f_1-f_2|\operatorname{d}\mu\Big) +\psi\Big(\int_S\varphi\circ(-|f_2-f_1|)\operatorname{d}\mu\Big), {} \end{aligned} $$

(8.41)

Moreover, φ ∘ (f₁ − f₂) + φ ∘ (f₂ − f₁) = φ ∘ |f₁ − f₂| + φ ∘ (−|f₁ − f₂|) and hence

$$\displaystyle \begin{aligned} \int_S\varphi\circ(f_1-f_2)\operatorname{d}\mu+\int_S\varphi\circ(f_2-f_1)\operatorname{d}\mu{=} \int_S\varphi\circ|f_1-f_2|\operatorname{d}\mu+\int_S\varphi\circ(-|f_2-f_1|)\operatorname{d}\mu. \end{aligned} $$

(8.42)

Also, since φ is nondecreasing, φ ∘ (f ₁ − f ₂) ∨ φ ∘ (f ₂ − f ₁) ≤ φ ∘|f ₁ − f ₂| and hence

$$\displaystyle \begin{aligned} \int_S\varphi\circ(f_1-f_2)\operatorname{d}\mu\;\vee\;\int_S\varphi\circ(f_2-f_1)\operatorname{d}\mu\le \int_S\varphi\circ|f_1-f_2|\operatorname{d}\mu. \end{aligned} $$

(8.43)

Since the function ψ is convex, it follows from (8.40)–(8.43) that

$$\displaystyle \begin{aligned} \tilde\Psi(f_1-f_2)\le\tilde\Psi(|f_1-f_2|). \end{aligned} $$

(8.44)
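Inequality (8.44) can be illustrated on a finite set S. The sketch below assumes, consistently with (8.37) and (8.40), that Ψ(g) = ψ(∫ φ∘g dμ); the particular choices φ = exp, ψ(x) = x², the weights of μ, and the functions f₁, f₂ are hypothetical and serve only as an example.

```python
import math

def Psi(g, weights, phi, psi):
    """psi(integral of phi∘g d mu) for a function g on a finite set, cf. (8.40)."""
    return psi(sum(w * phi(x) for x, w in zip(g, weights)))

def Psi_tilde(g, weights, phi, psi):
    """Formula (8.37): Psi(g) + Psi(-g)."""
    neg = [-x for x in g]
    return Psi(g, weights, phi, psi) + Psi(neg, weights, phi, psi)

def psi(x):
    # a convex function with psi(0) = 0 (illustrative choice)
    return x * x

phi = math.exp                    # a nondecreasing function (illustrative choice)
weights = [0.5, 1.0, 2.0, 0.25]   # hypothetical finite measure mu on a 4-point set S
f1 = [1.0, -2.0, 0.5, 3.0]
f2 = [0.0, 1.5, 0.5, -1.0]

d = [x - y for x, y in zip(f1, f2)]
lhs = Psi_tilde(d, weights, phi, psi)                       # left side of (8.44)
rhs = Psi_tilde([abs(x) for x in d], weights, phi, psi)     # right side of (8.44)
print(lhs <= rhs)
```

The mechanism is the one used in the proof: the two pairs of integrals have the same sum by (8.42), the pair on the right is more spread out by (8.43), and a convex ψ gives the more spread-out pair the larger total.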

Further, take any $j\in \overline {3,n}$. Then φ ∘ g _j + φ ∘ h _j = φ ∘ (g _j ∧ h _j) + φ ∘ (g _j ∨ h _j). So,

$$\displaystyle \begin{aligned} \int_S(\varphi\circ g_j)\operatorname{d}\mu+\int_S(\varphi\circ h_j)\operatorname{d}\mu =\int_S\varphi\circ(g_j\wedge h_j)\operatorname{d}\mu+\int_S\varphi\circ(g_j\vee h_j)\operatorname{d}\mu. \end{aligned}$$

Moreover, since φ is nondecreasing, $\int _S\varphi \circ (g_j\vee h_j)\operatorname {d}\mu $ is no less than each of the integrals $\int _S(\varphi \circ g_j)\operatorname {d}\mu $ and $\int _S(\varphi \circ h_j)\operatorname {d}\mu $. So, in view of (8.12) and the convexity of the function ψ, one has Ψ(g _j) + Ψ(h _j) ≤ Ψ(g _j ∧ h _j) + Ψ(g _j ∨ h _j). Similarly, because $\int _S\varphi \circ (-(g_j\wedge h_j))\operatorname {d}\mu $ is no less than each of the integrals $\int _S\varphi \circ (-g_j)\operatorname {d}\mu $ and $\int _S\varphi \circ (-h_j)\operatorname {d}\mu $, one has Ψ(−g _j) + Ψ(−h _j) ≤ Ψ(−(g _j ∧ h _j)) + Ψ(−(g _j ∨ h _j)). So, by (8.37), $\tilde \Psi (g_j)+\tilde \Psi (h_j)\le \tilde \Psi (g_j\wedge h_j)+\tilde \Psi (g_j\vee h_j)$.

Therefore, by (8.38), (8.39), and (8.44), Λ(f₁, f₂, f₃, …, f_n) ≤ Λ(f₁ ∧ f₂, f₁ ∨ f₂, f₃, …, f_n). Similarly, Λ(f₁, …, f_{j−1}, f_j, f_{j+1}, f_{j+2}, …, f_n) ≤ Λ(f₁, …, f_{j−1}, f_j ∧ f_{j+1}, f_j ∨ f_{j+1}, f_{j+2}, …, f_n) for all $j\in \overline {1,{n-1}}$.

Thus, the function Λ is generalized (n:2)-supermodular, and so, by Theorem 8.1.1, it is generalized n-supermodular. □

Proof of Proposition 8.2.4

Fix any (f ₁, …, f _n) ∈ L ⁿ. Then, in view of the permutation symmetry of $\overline {m}$ defined by (8.15),

$$\displaystyle \begin{aligned} \frac 1{k!}\,\Lambda_m(f_1,\dots,f_n)=\lambda_2(f_{n-1},f_n)+\lambda_1(f_{n-1})+\lambda_1(f_n)+\lambda_0, \end{aligned} $$

(8.45)

where

$$\displaystyle \begin{aligned} \lambda_2(f,g)&:=\sum_{1\le i_1<\dots<i_{k-2}\le n-2}\overline{m}(f_{i_1},\dots,f_{i_{k-2}},f,g), \\ \lambda_1(f)&:=\sum_{1\le i_1<\dots<i_{k-1}\le n-2}\overline{m}(f_{i_1},\dots,f_{i_{k-1}},f), \\ \lambda_0&:=\sum_{1\le i_1<\dots<i_k\le n-2}\overline{m}(f_{i_1},\dots,f_{i_k}). \end{aligned} $$

Similarly,

$$\displaystyle \begin{aligned} \frac 1{k!}\,\Lambda_m(f_1,\dots,f_{n-2},f_{n-1}\wedge f_n,f_{n-1}\vee f_n) =\lambda_2(f_{n-1}\wedge f_n,f_{n-1}\vee f_n) \\ +\lambda_1(f_{n-1}\wedge f_n)+\lambda_1(f_{n-1}\vee f_n)+\lambda_0. \end{aligned} $$

(8.46)

Note that the function $\lambda _2\colon L^2\to \mathbb {R}$ is 2-additive and permutation-symmetric, and the function $\lambda _1\colon L\to \mathbb {R}$ is additive. Take any f and g in L. Then (f ∨ g) ∧ f = f and (f ∨ g) ∖ f = g ∖ f. So, by the additivity of λ₁ we have λ₁(f ∨ g) = λ₁(f) + λ₁(g ∖ f), whereas λ₁(f ∧ g) + λ₁(g ∖ f) = λ₁(g). So,

$$\displaystyle \begin{aligned} \lambda_1(f\wedge g)+\lambda_1(f\vee g) =\lambda_1(f\wedge g)+\lambda_1(f)+\lambda_1(g\setminus f) = \lambda_1(f)+\lambda_1(g). \end{aligned} $$

(8.47)

Because the function λ₂ is 2-additive, permutation-symmetric, and nonnegative, we have

$$\displaystyle \begin{aligned} \begin{aligned} \lambda_2(f\wedge g,f\vee g) &=\lambda_2(f\wedge g,f\setminus g)+\lambda_2(f\wedge g,g) \\ &=\lambda_2(f\wedge g,f\setminus g)+\lambda_2(f,g)-\lambda_2(f\setminus g,g) \\ &=\lambda_2(f\wedge g,f\setminus g)+\lambda_2(f,g)-\lambda_2(f\setminus g,g\wedge f)-\lambda_2(f\setminus g,g\setminus f) \\ &=\lambda_2(f,g)-\lambda_2(f\setminus g,g\setminus f) \\ &\le\lambda_2(f,g). \end{aligned} \end{aligned} $$

(8.48)
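The chain (8.48) can be checked on a concrete 2-additive, symmetric, nonnegative function. As an illustrative assumption, the sketch takes λ₂(A, B) := card(G ∩ (A × B)) for a symmetric finite relation G, in the spirit of (8.27); the relation `edges` and the sets f and g are hypothetical.

```python
def lam2(G, A, B):
    """card(G ∩ (A x B)): 2-additive and nonnegative; symmetric if G is."""
    return sum(1 for (s, t) in G if s in A and t in B)

# Hypothetical symmetric relation G on S = {0,...,5}: graph edges in both orientations
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 5)]
G = set(edges) | {(t, s) for (s, t) in edges}

f, g = {0, 1, 2}, {2, 3, 4}
meet, join = f & g, f | g
lhs = lam2(G, meet, join)                          # lambda_2(f ∧ g, f ∨ g)
rhs = lam2(G, f, g) - lam2(G, f - g, g - f)        # next-to-last line of (8.48)
print(lhs, rhs, lhs <= lam2(G, f, g))
```

The subtracted term counts the pairs joining f ∖ g to g ∖ f, which is the quantity lost when (f, g) is replaced by (f ∧ g, f ∨ g).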

It follows from (8.45), (8.46), (8.47), and (8.48) (with f = f _n−1 and g = f _n) that

$$\displaystyle \begin{aligned} \Lambda_m(f_1,\dots,f_{n-2},f_{n-1}\wedge f_n,f_{n-1}\vee f_n)\le\Lambda_m(f_1,\dots,f_n). \end{aligned}$$

Therefore, being permutation-symmetric, the function Λ_m is indeed generalized (n:2)-submodular. Hence, by Theorem 8.1.1, Λ_m is generalized n-submodular. □

References

1. R. Aharoni, U. Keich, A generalization of the Ahlswede-Daykin inequality. Discrete Math. 152(1–3), 1–12 (1996)
2. R. Ahlswede, D.E. Daykin, Inequalities for a pair of maps S × S → S with S a finite set. Math. Z. 165(3), 267–289 (1979)
3. F. Bach, Learning with submodular functions: a convex optimization perspective. Found. Trends® Mach. Learn. 6(2–3), 145–373 (2013)
4. F. Bach, Submodular functions: from discrete to continuous domains. Math. Program. 175(1–2), 419–459 (2019)
5. C. Borell, A note on an inequality for rearrangements. Pac. J. Math. 47, 39–41 (1973)
6. G. Choquet, Theory of capacities. Ann. Inst. Fourier 5, 131–295 (1954)
7. H.A. David, H.N. Nagaraja, Order Statistics. Wiley Series in Probability and Statistics, 3rd edn. (Wiley, Hoboken, 2003)
8. C.M. Fortuin, P.W. Kasteleyn, J. Ginibre, Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89–103 (1971)
9. S. Fujishige, Submodular Functions and Optimization. Annals of Discrete Mathematics, vol. 47 (North-Holland, Amsterdam, 1991)
10. G. Grätzer, Lattice Theory: Foundation (Birkhäuser, Basel, 2011)
11. G.H. Hardy, J.E. Littlewood, G. Pólya, Inequalities, 2nd edn. (Cambridge University Press, Cambridge, 1952)
12. K. Joag-Dev, F. Proschan, Negative association of random variables, with applications. Ann. Stat. 11(1), 286–295 (1983)
13. S. Karlin, Y. Rinott, Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions. J. Multivariate Anal. 10(4), 467–498 (1980)
14. S. Karlin, Y. Rinott, Classes of orderings of measures and related correlation inequalities. II. Multivariate reverse rule distributions. J. Multivariate Anal. 10(4), 499–516 (1980)
15. D.E. Knuth, The Art of Computer Programming, vol. 3: Sorting and Searching, 2nd edn. (Addison-Wesley, Reading, 1998)
16. G.G. Lorentz, An inequality for rearrangements. Am. Math. Mon. 60, 176–179 (1953)
17. A.W. Marshall, I. Olkin, B.C. Arnold, Inequalities: Theory of Majorization and Its Applications. Springer Series in Statistics, 2nd edn. (Springer, New York, 2011)
18. P. Milgrom, J. Roberts, Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica 58(6), 1255–1277 (1990)
19. H. Narayanan, Submodular Functions and Electrical Networks. Annals of Discrete Mathematics, vol. 54 (North-Holland, Amsterdam, 1997)
20. I. Pinelis, Optimal binomial, Poisson, and normal left-tail domination for sums of nonnegative random variables. Electron. J. Probab. 21(20), 19 (2016)
21. Y. Rinott, M. Saks, On FKG-type and permanental inequalities, in Stochastic Inequalities (Seattle, WA, 1991). IMS Lecture Notes Monograph Series, vol. 22 (Institute of Mathematical Statistics, Hayward, 1992), pp. 332–342
22. Y. Rinott, M. Saks, Correlation inequalities and a conjecture for permanents. Combinatorica 13(3), 269–277 (1993)
23. H.D. Ruderman, Two new inequalities. Am. Math. Mon. 59, 29–32 (1952)
24. D.M. Topkis, Equilibrium points in nonzero-sum n-person submodular games. SIAM J. Control Optim. 17(6), 773–787 (1979)
25. D.M. Topkis, Supermodularity and Complementarity. Frontiers of Economic Research (Princeton University Press, Princeton, 1998)

Author information

Authors and Affiliations

Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
Iosif Pinelis

Corresponding author

Correspondence to Iosif Pinelis .

Editor information

Editors and Affiliations

MAP 5, Université Paris Descartes, Paris, France
Nathael Gozlan
Institute of Mathematics, University of Warsaw, Warsaw, Poland
Rafał Latała
Centre de Mathématiques Appliquées, Ecole Polytechnique, Palaiseau, France
Karim Lounici
Department of Mathematical Sciences, University of Delaware, Newark, DE, USA
Mokshay Madiman

Cite this paper

Pinelis, I. (2019). Generalized Semimodularity: Order Statistics. In: Gozlan, N., Latała, R., Lounici, K., Madiman, M. (eds) High Dimensional Probability VIII. Progress in Probability, vol 74. Birkhäuser, Cham. https://doi.org/10.1007/978-3-030-26391-1_8

DOI: https://doi.org/10.1007/978-3-030-26391-1_8
Published: 27 November 2019
Publisher Name: Birkhäuser, Cham
Print ISBN: 978-3-030-26390-4
Online ISBN: 978-3-030-26391-1
eBook Packages: Mathematics and Statistics (R0)
