1 Introduction

Although mixed-norm \(L^p\) spaces were described by Benedek and Panzone [3] in 1961, their applications have appeared in the literature at least since Littlewood’s 4/3 inequality [15] in 1930, a fundamental step in the theory of bilinear forms and a precursor to Grothendieck’s later work on multilinear forms [13]. This inequality is generalized by the Bohnenblust–Hille inequality, for which recent advances [8] have been achieved through techniques including mixed-norm estimates.

Fournier [10] devised a mixed-norm approach to Sobolev embeddings, followed by the work of authors including Algervik and Kolyada [2], as well as Clavero and Soria [7]. The notion of symmetric mixed-norm spaces is central to this work, so much so that in [7] they are simply called “mixed norm spaces”. That paper uses “Benedek-Panzone spaces” to refer to those spaces which are called mixed-norm spaces in [3] and here. Estimates by geometric means of mixed norms, similarly symmetric in the sense that each mixed norm involved features the same exponents but differently permuted variables, appear frequently in the literature; see [4, 8, 17], and even [15].

Such estimates are useful, but have often been established by tricky inductions on the number of variables, using the classical (one-variable) Hölder’s inequality and Minkowski’s integral inequality. The difficulty of these proofs not only hinders communication, but makes it harder to find strong results. The mixed-norm version of Hölder’s inequality was introduced in [3], but has been developed further since, with generalizations given in the recent expository paper [1]. It can be used together with the mixed-norm form of Minkowski’s integral inequality, introduced in [10], to simplify many arguments, but these techniques have often been overlooked.

In Sect. 2, a general version of Minkowski’s integral inequality for mixed norms is stated and proved. Although this theorem is known, this treatment is more general and detailed than others, and uses notation suited to the main results to follow. (Another description, with different notation, is in the thesis [12], where the appendix gives some of the applications here.) Section 3 provides the main new results, Theorem 3 and Corollary 2, estimates where the upper bounds are symmetric geometric means of mixed norms. These give general embeddings of symmetric mixed-norm spaces into Lebesgue spaces, requiring no more computation than finding harmonic means.

Section 4 shows that various known estimates are simple special cases of these results. Section 5 treats examples where these theorems do not apply, but mixed-norm Hölder’s and Minkowski’s inequalities still simplify the proofs. Finally, Theorem 4 is a new result which combines features of existing estimates in a more complicated inequality, which is nonetheless fairly straightforward to establish with mixed-norm techniques.

In some specific cases, stronger embedding results have been proved than those given here. For example, Fournier’s paper [10] and his joint work with Blei [6] give embeddings into Lorentz spaces \(\ell ^{r,1}\), stronger than the embeddings into \(\ell ^r\) which would be obtained with the methods given here. Milman [16] uses interpolation to produce similar embeddings. Algervik and Kolyada [2] establish embeddings of symmetric mixed-norm spaces into Lorentz spaces, and Clavero and Soria [7] extend this work to more general rearrangement-invariant spaces. But, while powerful, these results tend to be somewhat restricted, requiring that the mixed norms be of a particular form or feature certain exponents. In contrast, the results here apply to general exponents, and may be hoped to lead to stronger future results for Lorentz or other spaces.

2 Mixed-norm Minkowski’s integral inequality

While Minkowski’s integral inequality is fundamentally a mixed-norm inequality in two variables, it has a natural generalization to mixed norms in more variables. Fournier introduced a mixed-norm Minkowski’s inequality in [10], giving the key ideas but stating the theorem only for fully-sorted mixed norms. That version is given here as Corollary 1. That paper also coined the term “raises” to describe transpositions. The “raising” and “lowering” properties are given here for more general permutations in Definition 4.

Definition 1

Let \(\left( X_1, \mu _1\right) , \ldots , \left( X_n, \mu _n\right) \) be \(\sigma \)-finite measure spaces, with the product space \(\left( X, \mu \right) \). For any \(p_1, \ldots , p_n \in \left( 0,\infty \right] \), we can define a mixed norm of a measurable function \(f(x_1, \ldots , x_n) : X \rightarrow \mathbf {C}\) by first specifying a double n-tuple

$$\begin{aligned} P = \left( \begin{array}{llll} p_1 &{} p_2 &{} \cdots &{} p_n\\ x_1 &{} x_2 &{} \cdots &{} x_n \end{array}\right) , \end{aligned}$$

in terms of which the mixed norm is

$$\begin{aligned} \left\| f \right\| _P = \left( \int _{X_n} \cdots \left( \int _{X_1} \left| f(x_1, \ldots , x_n) \right| ^{p_1} d\mu _1(x_1) \right) ^{p_2/p_1} \cdots d\mu _n(x_n) \right) ^{1/p_n}, \end{aligned}$$

as long as each \(p_j < \infty \) (for \(j \in \left\{ 1, \ldots , n\right\} \)). As with the classical \(L^p\) norms, if any \(p_j = \infty \), the corresponding integral is replaced by an essential supremum in that variable.
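To make the definition concrete in the simplest discrete setting (counting measure on finite sets), here is a small numerical sketch; the function name and conventions are mine, not the paper’s:

```python
import numpy as np

def mixed_norm(f, exponents):
    # Mixed norm of an array over finite sets with counting measure:
    # integrate out axis 0 with exponent p_1, then the next remaining
    # axis with p_2, and so on. np.inf means a supremum in that variable.
    g = np.abs(np.asarray(f, dtype=float))
    for p in exponents:
        if np.isinf(p):
            g = g.max(axis=0)
        else:
            g = (g ** p).sum(axis=0) ** (1.0 / p)
    return float(g)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
# With every exponent equal to p, the mixed norm is the ordinary L^p norm:
assert abs(mixed_norm(A, (2, 2)) - np.linalg.norm(A)) < 1e-12
```

When the exponents differ, the order of integration matters, which is exactly what the double \(n\)-tuple \(P\) records.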

Remark 1

\(\left\| \cdot \right\| _P\) is only a norm when every \(p_j \ge 1\); otherwise, the triangle inequality fails. Unless otherwise specified, however, “mixed norm” will be used here to include any \(\left\| \cdot \right\| _P\), even if it is not, strictly speaking, a norm.

Because the value of \(\left\| f \right\| _P\) depends only on the modulus \(\left| f\right| \), we need only consider \(f \ge 0\).

Definition 2

Let \(L^+(X)\) denote the cone of nonnegative measurable functions on X.

Definition 3

If \(\sigma \) is a permutation of \(\left\{ 1, \ldots , n\right\} \) and \(P = {{p_1 \,\cdots \, p_n} \atopwithdelims (){x_1 \,\cdots \, x_n}}\), then let

$$\begin{aligned} P \cdot \sigma = \left( \begin{array}{lllll} p_{\sigma (1)} &{} \cdots &{} p_{\sigma (j)} &{} \cdots &{} p_{\sigma (n)}\\ x_{\sigma (1)} &{} \cdots &{} x_{\sigma (j)} &{} \cdots &{} x_{\sigma (n)} \end{array}\right) . \end{aligned}$$

Extend this to P where the variables are not in numeric order by relabeling the variables.

Remark 2

This defines a right group action of the symmetric group \(S_n\), as for any \(\sigma , \rho \in S_n\),

$$\begin{aligned} (P \cdot \sigma ) \cdot \rho = P \cdot (\sigma \rho ). \end{aligned}$$

Lemma 1

Suppose that \(p_1, \ldots , p_n \in \left( 0, \infty \right] \),

$$\begin{aligned} P = \left( \begin{array}{cccccc} p_1 &{} \cdots &{} p_j &{} p_{j+1} &{} \cdots &{} p_n \\ x_1 &{} \cdots &{} x_j &{} x_{j+1} &{} \cdots &{} x_n\end{array}\right) , \end{aligned}$$

\(1 \le j < n\), and \(p_j \le p_{j+1}\). Let \(\tau \) denote the transposition which swaps j and \(j+1\), fixing all other values in \(\left\{ 1, \ldots , n\right\} \). Then, for any \(f(x_1, \ldots , x_n) \in L^+(X),\)

$$\begin{aligned} \left\| f \right\| _P \le \left\| f \right\| _{P \cdot \tau }. \end{aligned}$$

Proof

Define the function

$$\begin{aligned} g(x_j, \ldots , x_n) = \left( \int _{X_{j-1}} \cdots \left( \int _{X_1} f^{p_1} d\mu _1(x_1) \right) ^{p_2/p_1} \cdots d\mu _{j-1}(x_{j-1}) \right) ^{1/p_{j-1}}, \end{aligned}$$

which computes a mixed norm over the first \(j-1\) variables (if \(j=1\), these are zero variables, so this is interpreted as \(g=f\)), depending on the remaining variables. Fixing \(x_{j+2}, \ldots , x_n\) (i.e. every variable after \(x_{j+1}\)), Minkowski’s integral inequality, applied with the exponent \(\frac{p_{j+1}}{p_j} \ge 1\), shows that

$$\begin{aligned} \left\| g \right\| _{{p_j \, p_{j+1}} \atopwithdelims (){x_j \, x_{j+1}}}&= \left( \int _{X_{j+1}} \left( \int _{X_j} g^{p_j} d\mu _j(x_j) \right) ^\frac{p_{j+1}}{p_j} d\mu _{j+1}(x_{j+1}) \right) ^\frac{1}{p_{j+1}} \\&\le \left( \int _{X_j} \left( \int _{X_{j+1}} g^{p_{j+1}} d\mu _{j+1}(x_{j+1}) \right) ^\frac{p_j}{p_{j+1}} d\mu _j(x_j) \right) ^\frac{1}{p_j} \\&= \left\| g \right\| _{{p_{j+1} \, p_j} \atopwithdelims (){x_{j+1} \, x_j}}. \end{aligned}$$

This can be interpreted as an inequality of functions of \(x_{j+2}, \ldots , x_n\). Both the integral and essential supremum are order-preserving on nonnegative functions. Consequently, if \(0 \le f_1 \le f_2\), then for any \(L^p\) or mixed norm \(\left\| \cdot \right\| \), \(\left\| f_1 \right\| \le \left\| f_2 \right\| .\)

Therefore we can apply the mixed norm \({p_{j+2} \, \cdots \, p_n} \atopwithdelims (){x_{j+2} \, \cdots \, x_n}\) in the remaining variables to both sides above, yielding

$$\begin{aligned} \left\| f \right\| _P = \left\| \left\| g \right\| _{{p_j \, p_{j+1}} \atopwithdelims (){x_j \, x_{j+1}}} \right\| _{{p_{j+2} \,\cdots \, p_n} \atopwithdelims (){x_{j+2} \,\cdots \, x_n}} \le \left\| \left\| g \right\| _{{p_{j+1} \, p_j} \atopwithdelims (){x_{j+1} \, x_j}} \right\| _{{p_{j+2} \,\cdots \, p_n} \atopwithdelims (){x_{j+2} \,\cdots \, x_n}} = \left\| f \right\| _{P \cdot \tau }. \end{aligned}$$

\(\square \)
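Under counting measure, the lemma is easy to sanity-check numerically; the following is an illustrative sketch (helper name mine), with \(P = {{1\,2}\atopwithdelims(){x_1\,x_2}}\) and \(\tau\) swapping the two columns:

```python
import numpy as np

def mn(f, exponents):
    # Mixed norm for counting measure; axis 0 is integrated out first.
    g = np.asarray(f, dtype=float)
    for p in exponents:
        g = (g ** p).sum(axis=0) ** (1.0 / p)
    return float(g)

rng = np.random.default_rng(1)
f = rng.random((4, 5))
# tau swaps the columns of P, so ||f||_{P.tau} is computed on the
# transposed array with exponents (2, 1): the exponent 2 moves earlier.
assert mn(f, (1, 2)) <= mn(f.T, (2, 1)) + 1e-12
```

Here the left side is the \(\ell^2\) norm (in \(x_2\)) of a sum over \(x_1\), the right side a sum over \(x_1\) of \(\ell^2\) norms, which is the classical Minkowski inequality.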

Definition 4

With

$$\begin{aligned} P = \left( \begin{array}{ccc} p_1 &{} \ldots &{} p_n\\ x_1 &{} \ldots &{} x_n \end{array}\right) , \end{aligned}$$

a permutation \(\sigma \) raises P if \(p_i~\le ~p_j\) whenever \(i~<~j\) and \(\sigma ^{-1}(j)~<~\sigma ^{-1}(i)\). Similarly, a permutation \(\sigma \) lowers P if \(p_j~\le ~p_i\) whenever \(i~<~j\) and \(\sigma ^{-1}(j)~<~\sigma ^{-1}(i)\).
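Definition 4 can be checked mechanically. The following sketch (0-indexed, with function names of my own choosing) tests the two conditions directly:

```python
def _inverse(sigma):
    # Inverse of a permutation of range(n) given as a tuple.
    inv = [0] * len(sigma)
    for k, s in enumerate(sigma):
        inv[s] = k
    return inv

def raises(sigma, ps):
    # sigma raises P when p_i <= p_j for every pair i < j that sigma
    # puts out of order, i.e. whenever sigma^{-1}(j) < sigma^{-1}(i).
    inv = _inverse(sigma)
    n = len(ps)
    return all(ps[i] <= ps[j]
               for i in range(n) for j in range(i + 1, n)
               if inv[j] < inv[i])

def lowers(sigma, ps):
    # Same pairs, with the opposite comparison of exponents.
    inv = _inverse(sigma)
    n = len(ps)
    return all(ps[j] <= ps[i]
               for i in range(n) for j in range(i + 1, n)
               if inv[j] < inv[i])

# Reversing increasing exponents moves larger exponents earlier: it raises.
assert raises((2, 1, 0), (1, 2, 3)) and not lowers((2, 1, 0), (1, 2, 3))
```

Note that the identity permutation both raises and lowers every \(P\), since the conditions are then vacuous.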

Remark 3

An adjacent transposition \(\tau = (\begin{array}{ll}j&j+1\end{array})\) raises

$$\begin{aligned} P \cdot \sigma = \left( \begin{array}{ccc} p_{\sigma (1)} &{} \cdots &{} p_{\sigma (n)} \\ x_{\sigma (1)} &{} \cdots &{} x_{\sigma (n)} \end{array}\right) \end{aligned}$$

if and only if \(p_{\sigma (j)} \le p_{\sigma (j+1)}\). Similarly, this \(\tau \) lowers \(P \cdot \sigma \) if and only if \(p_{\sigma (j+1)} \le p_{\sigma (j)}\).

Lemma 2

A permutation \(\sigma \) raises P if and only if \(\sigma ^{-1}\) lowers \(P \cdot \sigma \). (Equivalently, \(\sigma \) lowers P if and only if \(\sigma ^{-1}\) raises \(P \cdot \sigma \).)

Proof

As defined, \(\sigma \) raises P if and only if \(p_i \le p_j\) whenever \(i < j\) and \(\sigma ^{-1}(j) < \sigma ^{-1}(i)\). Let \(b = \sigma ^{-1}(i)\) and \(a = \sigma ^{-1}(j)\), and observe that this is equivalent to saying that \(p_{\sigma (b)} \le p_{\sigma (a)}\) whenever \(a < b\) and \(\sigma (b) < \sigma (a)\), i.e. that \(\sigma ^{-1}\) lowers \(P \cdot \sigma \).

To see that the second formulation is equivalent, just swap \(\sigma \) and \(\sigma ^{-1}\), P and \(P \cdot \sigma \), and note that \(P \cdot \sigma \cdot \sigma ^{-1} = P\). \(\square \)

Lemma 3

If \(\sigma \) raises P and \(\rho \) raises \(P \cdot \sigma \), then \(\sigma \rho \) raises P. Similarly, if \(\sigma \) lowers P and \(\rho \) lowers \(P \cdot \sigma \), then \(\sigma \rho \) lowers P.

Proof

Suppose that \(\sigma \) raises P and that \(\rho \) raises \(P \cdot \sigma \). Consider any \(i < j\) such that \((\sigma \rho )^{-1}(j) < (\sigma \rho )^{-1}(i)\).

If \(\sigma ^{-1}(j) < \sigma ^{-1}(i)\), then \(p_i \le p_j\), because \(\sigma \) raises P and \(i < j\). Otherwise, \(\sigma ^{-1}(i) < \sigma ^{-1}(j)\) and \(\rho ^{-1}(\sigma ^{-1}(j)) < \rho ^{-1}(\sigma ^{-1}(i))\). Because \(\rho \) raises \(P \cdot \sigma \), this means that \(\left( P \cdot \sigma \right) _{\sigma ^{-1}(i)} \le \left( P \cdot \sigma \right) _{\sigma ^{-1}(j)}\), i.e. \(p_i = p_{\sigma (\sigma ^{-1}(i))} \le p_{\sigma (\sigma ^{-1}(j))} = p_j\).

Either way, \(p_i \le p_j\), so \(\sigma \rho \) raises P.

Next, assume that \(\sigma \) lowers P and \(\rho \) lowers \(P \cdot \sigma \). By Lemma 2, this means that \(\rho ^{-1}\) raises \((P \cdot \sigma ) \cdot \rho = P \cdot \sigma \rho \), and \(\sigma ^{-1}\) raises \(P \cdot \sigma \). By the previous part of this lemma, \(\rho ^{-1}\sigma ^{-1}\) raises \(P\cdot \sigma \rho \). Applying Lemma 2 again, this means that \(\sigma \rho \) lowers P, as desired. \(\square \)

Theorem 1

Any permutation raises P if and only if it is a composition \(\tau _1 \cdots \tau _m\) (for some \(m \ge 0\)) of adjacent transpositions such that, for each \(1 \le k \le m\), \(\tau _k\) raises \(P \cdot \tau _1 \cdots \tau _{k-1}\).

Similarly, any permutation lowers P if and only if it is a composition of adjacent transpositions \(\tau _1 \cdots \tau _m\) such that each \(\tau _k\) lowers \(P \cdot \tau _1 \cdots \tau _{k-1}\).

Proof

If \(\sigma = \tau _1 \cdots \tau _m\) is a composition as specified, each \(\tau _k\) raising (or lowering) \(P \cdot \tau _1 \cdots \tau _{k-1}\), then \(\sigma \) raises (or lowers) P, by Lemma 3.

Now suppose that \(\sigma \) raises P. The proof that it is a composition of adjacent transpositions as above is by induction on the number of inversions in \(\sigma \), i.e. the number of pairs \(i < j\) such that \(\sigma (j) < \sigma (i)\). As a base case, the identity is an empty composition. It is impossible to have \(\sigma (1) \le \cdots \le \sigma (n)\) unless \(\sigma \) is the identity, so any non-identity \(\sigma \) must have at least one inverted adjacent pair, say \(\sigma (k+1) < \sigma (k)\).

Let \(a = \sigma (k+1)\) and \(b = \sigma (k)\) and note that \(a < b\) and \(\sigma ^{-1}(b) < \sigma ^{-1}(a)\), so because \(\sigma \) raises P, \(p_a \le p_b\). Let \(\tau = \left( \begin{array}{ll} k&k+1\end{array}\right) \) and observe that

$$\begin{aligned} P \cdot \sigma \tau = \left( \begin{array}{llllllll} p_{\sigma (1)} &{} \cdots &{} p_{\sigma (k-1)} &{} p_a &{} p_b &{} p_{\sigma (k+2)} &{} \cdots &{} p_{\sigma (n)} \\ x_{\sigma (1)} &{} \cdots &{} x_{\sigma (k-1)} &{} x_a &{} x_b &{} x_{\sigma (k+2)} &{} \cdots &{} x_{\sigma (n)} \end{array}\right) . \end{aligned}$$

For any pair \(i < j\), \(\sigma (i)\) and \(\sigma (j)\) are in the same relative order as \(\sigma \tau (i)\) and \(\sigma \tau (j)\) unless the pair consists of k and \(k+1\). Because \(\sigma \tau (k) = \sigma (k+1) < \sigma (k) = \sigma \tau (k+1)\) and \(\sigma \) raises P, \(\sigma \tau \) also raises P. Since \(\sigma \tau \) has one fewer inversion than \(\sigma \) (as \(\sigma \tau (k) = a < b = \sigma \tau (k+1)\)), by the inductive hypothesis, there are adjacent transpositions \(\tau _1, \ldots , \tau _m\) such that \(\sigma \tau = \tau _1 \cdots \tau _m\) and each \(\tau _k\) raises \(P \cdot \tau _1 \cdots \tau _{k-1}\).

Finally, \(\tau = \left( \begin{array}{cc} k&k+1\end{array}\right) \) raises \(P \cdot \sigma \tau \), because \(a < b\), \(\tau ^{-1}(b) = k < k+1 = \tau ^{-1}(a)\), and \(p_a \le p_b\). Therefore we let \(\tau _{m+1} = \tau \) and have \(\sigma = \tau _1 \cdots \tau _{m+1}\) as desired.

Now, if \(\sigma \) lowers P, then \(\sigma ^{-1}\) raises \(P \cdot \sigma \) by Lemma 2. The preceding characterization shows that \(\sigma ^{-1} = \tau _1 \cdots \tau _m\) as a composition of adjacent transpositions, where each \(\tau _k\) raises \(P \cdot \sigma \tau _1 \cdots \tau _{k-1} = P \cdot \tau _m \cdots \tau _k\). Therefore \(\sigma = \tau _m \cdots \tau _1\), where by Lemma 2 each \(\tau _k = \tau _k^{-1}\) lowers \(P \cdot \tau _m \cdots \tau _{k+1}\). This is the desired result, up to relabeling each \(\tau _k\) as \(\tau _{m+1-k}\). \(\square \)

Theorem 2

(Mixed-norm Minkowski’s integral inequality) If \(\sigma \) is a permutation which raises P, then for any \(f \in L^+(X)\), \(\left\| f \right\| _P \le \left\| f \right\| _{P \cdot \sigma }\).

Similarly, if \(\rho \) lowers P, then for any \(f \in L^+(X)\), \(\left\| f \right\| _{P \cdot \rho } \le \left\| f \right\| _P\).

Proof

Suppose that \(\sigma \) raises P. Use Theorem 1 to write \(\sigma = \tau _1 \cdots \tau _m\), a composition of adjacent transpositions, where each \(\tau _k\) raises \(P \cdot \tau _1 \cdots \tau _{k-1}\). The adjacent transposition \(\tau _1\) can be expressed as \(\left( \begin{array}{cc} j&j+1\end{array}\right) \) for some \(1 \le j < n\). As noted in Remark 3, \(p_j \le p_{j+1}\) because the adjacent transposition \(\tau _1\) raises P. (In other words, \(\tau _1\) swaps adjacent exponents in P so as to move the larger one, \(p_{j+1}\), to a position earlier in the list.) By Lemma 1, for any \(f \in L^+(X)\),

$$\begin{aligned} \left\| f \right\| _P \le \left\| f \right\| _{P \cdot \tau _1}. \end{aligned}$$

Similarly, since each adjacent transposition \(\tau _k\) raises \(P \cdot \tau _1 \cdots \tau _{k-1}\), it swaps adjacent exponents in \(P \cdot \tau _1 \cdots \tau _{k-1}\) so as to move the larger one earlier in the list. By Lemma 1, this means that, for any \(f \in L^+(X)\),

$$\begin{aligned} \left\| f \right\| _{P \cdot \tau _1 \cdots \tau _{k-1}} \le \left\| f \right\| _{P \cdot \tau _1 \cdots \tau _k}. \end{aligned}$$

Taken together, all these steps give

$$\begin{aligned} \left\| f \right\| _P \le \left\| f \right\| _{P \cdot \tau _1} \le \cdots \le \left\| f \right\| _{P \cdot \tau _1 \cdots \tau _m} = \left\| f \right\| _{P \cdot \sigma }. \end{aligned}$$

The proof when \(\rho \) lowers P is similar, with the inequalities reversed. \(\square \)
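A three-variable spot check of Theorem 2 under counting measure (a sketch; the helper name is mine): with exponents \((1,2,3)\) on \((x_1,x_2,x_3)\), the reversal \(\sigma\) sorts the exponents into decreasing order and hence raises \(P\), and \(P \cdot \sigma\) has the reversed exponents on the reversed variables.

```python
import numpy as np

def mn(f, exponents):
    # Mixed norm for counting measure; axis 0 is integrated out first.
    g = np.asarray(f, dtype=float)
    for p in exponents:
        g = (g ** p).sum(axis=0) ** (1.0 / p)
    return float(g)

rng = np.random.default_rng(2)
f = rng.random((3, 4, 5))
lhs = mn(f, (1, 2, 3))                       # ||f||_P
rhs = mn(f.transpose(2, 1, 0), (3, 2, 1))    # ||f||_{P.sigma}, x3 innermost
assert lhs <= rhs + 1e-12
```

The transpose reorders the axes so that \(x_3\) is integrated first with the exponent 3, matching the columns of \(P \cdot \sigma\).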

Corollary 1

(Fournier’s fully-sorted Minkowski) Let

$$\begin{aligned} P = \left( \begin{array}{lll} p_1 &{} \cdots &{} p_n \\ x_1 &{} \cdots &{} x_n \end{array}\right) , \end{aligned}$$

and let \(\sigma , \rho \in S_n\) be permutations such that

$$\begin{aligned} p_{\sigma (1)} \ge p_{\sigma (2)} \ge \cdots \ge p_{\sigma (n)} \text { and } p_{\rho (1)} \le p_{\rho (2)} \le \cdots \le p_{\rho (n)}. \end{aligned}$$

Then, for any \(f \in L^+(X)\),

$$\begin{aligned} \left\| f \right\| _{P \cdot \rho } \le \left\| f \right\| _P \le \left\| f \right\| _{P \cdot \sigma }. \end{aligned}$$

Proof

Any list can be sorted by adjacent swaps of out-of-order elements; see, for example, the bubble sort algorithm, as described in [14, pp. 106–111]. Such sorting of the exponents into numeric order takes P to \(P \cdot \rho \), for some \(\rho \in S_n\) which lowers P, as defined in Definition 4. Sorting into reverse numeric order takes P to some \(P \cdot \sigma \), where \(\sigma \) raises P.

By the mixed-norm version of Minkowski’s integral inequality from Theorem 2,

$$\begin{aligned} \left\| f \right\| _{P \cdot \rho } \le \left\| f \right\| _P \le \left\| f \right\| _{P \cdot \sigma }. \end{aligned}$$

\(\square \)
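The sorting in this proof can be made explicit. A sketch of the bubble-sort decomposition (function name mine): each recorded swap moves a larger exponent earlier, so it raises the current tuple in the sense of Definition 4, exactly as Theorem 1 requires.

```python
def descending_swaps(ps):
    # Bubble-sort the exponents into decreasing order, recording the
    # adjacent transpositions used; each swap moves the larger exponent
    # earlier in the list, so each one raises the current tuple.
    ps = list(ps)
    swaps = []
    changed = True
    while changed:
        changed = False
        for j in range(len(ps) - 1):
            if ps[j] < ps[j + 1]:          # out of order for descending sort
                ps[j], ps[j + 1] = ps[j + 1], ps[j]
                swaps.append(j)            # the transposition (j, j+1), 0-indexed
                changed = True
    return ps, swaps

assert descending_swaps((1, 3, 2)) == ([3, 2, 1], [0, 1])
```

Sorting into increasing order instead records transpositions that lower \(P\), giving the permutation \(\rho\) of the corollary.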

3 Estimates with symmetric geometric means of mixed norms

Again, let \((X_1, \mu _1), \ldots , (X_n, \mu _n)\) be \(\sigma \)-finite measure spaces with product \((X, \mu )\). Recall the mixed-norm Hölder’s inequality given by Benedek and Panzone early in [3]. (This theorem can be proved by applying the m-function Hölder’s inequality in each variable successively.)

Proposition 1

(Mixed-norm Hölder’s inequality) Let \(f_1, \ldots , f_m \in L^+(X)\) be any functions, with corresponding double n-tuples \(P_1, \ldots , P_m\), each

$$\begin{aligned} P_i = \left( \begin{array}{lll} p_{i,1} &{} \cdots &{} p_{i,n}\\ x_1 &{} \cdots &{} x_n \end{array}\right) \end{aligned}$$
(1)

such that \(\sum _{i=1}^m P_i^{-1} = 1\), understood coordinatewise. That is, for each \(j \in \left\{ 1, \ldots , n\right\} \), \(\sum _{i=1}^m p_{i,j}^{-1} = 1.\) Then

$$\begin{aligned} \int _X f_1 \cdots f_m d\mu \le \left\| f_1 \right\| _{P_1} \cdots \left\| f_m \right\| _{P_m}. \end{aligned}$$

Note that, while useful, the above theorem is only the beginning of mixed-norm generalizations of Hölder’s inequality. Aside from the estimates given in this paper, see [1] (especially its Theorems 3.1 and 3.2) for other flexible estimates, some even allowing a mixed norm on the left-hand side rather than a classical \(L^p\) norm. The results are stated there for functions on products of finite spaces \(\left\{ 1, \ldots , n\right\} \), but can be proved for any \(\sigma \)-finite spaces using elementary methods.

Definition 5

Given

$$\begin{aligned} P = \left( \begin{array}{lll} p_1 &{} \cdots &{} p_n \\ x_1 &{} \cdots &{} x_n \end{array}\right) , \end{aligned}$$

denote the harmonic mean of the exponents in P by

$$\begin{aligned} \overline{p} = \left( \frac{1}{n} \sum _{j=1}^n p_j^{-1} \right) ^{-1}. \end{aligned}$$
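Computing \(\overline{p}\) exactly is a one-line calculation; a sketch using exact rational arithmetic (function name mine):

```python
from fractions import Fraction

def p_bar(ps):
    # Harmonic mean of the exponents in P (Definition 5), computed exactly.
    n = len(ps)
    return n / sum(1 / Fraction(p) for p in ps)

assert p_bar((2, 1)) == Fraction(4, 3)      # Littlewood's 4/3 exponent
assert p_bar((2, 1, 1)) == Fraction(6, 5)   # Blei's 6/5 exponent
```

These two values reappear in Sect. 4 as the exponents of Propositions 2 and 3.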

Definition 6

Define two more right actions of the symmetric group \(S_n\) by, for any \(\sigma \in S_n\), letting

$$\begin{aligned} P^\sigma = \left( \begin{array}{ccc} p_{\sigma (1)} &{} \cdots &{} p_{\sigma (n)} \\ x_1 &{} \cdots &{} x_n\end{array}\right) \text{ and } P_\sigma = \left( \begin{array}{ccc} p_1 &{} \cdots &{} p_n \\ x_{\sigma (1)} &{} \cdots &{} x_{\sigma (n)}\end{array}\right) . \end{aligned}$$

Definition 7

From now on, let m denote the size of the orbit \(\left\{ P^\sigma : \sigma \in S_n \right\} \) of P.

Remark 4

If the exponents \(\left\{ p_1, \ldots , p_n\right\} \) have r many distinct values \(v_1, \ldots , v_r\), such that each value \(v_k\) occurs \(n_k\) many times, then

$$\begin{aligned} m = \frac{n!}{n_1! \cdots n_r!}. \end{aligned}$$
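The count in Remark 4 can be checked by brute force, since the orbit \(\left\{ P^\sigma \right\}\) consists of the distinct orderings of the exponents (the variables stay fixed). A sketch, with names of my own choosing:

```python
from itertools import permutations
from math import factorial
from collections import Counter

def orbit_size(ps):
    # m = number of distinct orderings of the exponents (Definition 7).
    return len(set(permutations(ps)))

def multinomial(ps):
    # n! / (n_1! ... n_r!) as in Remark 4.
    m = factorial(len(ps))
    for count in Counter(ps).values():
        m //= factorial(count)
    return m

assert orbit_size((2, 1, 1)) == multinomial((2, 1, 1)) == 3
```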

Theorem 3

Given a fixed P, let its orbit \(\left\{ P^\sigma : \sigma \in S_n\right\} \) be enumerated by \(P_1, \ldots , P_m\). For any functions \(f_1, \ldots , f_m \in L^+(X)\),

$$\begin{aligned} \left\| \prod _{i=1}^m f_i^{1/m} \right\| _{L^{\overline{p}}(X)} \le \prod _{i=1}^m \left\| f_i \right\| _{P_i}^{1/m}. \end{aligned}$$

Proof

This result is trivial if all exponents are the same, with both sides \(L^{\overline{p}}(X)\) norms of a single function. Therefore assume this is not the case; then at least one exponent is finite, so \(\overline{p} < \infty \), and the formula in Remark 4 gives \(m \ge n\).

For each \(1 \le i \le m\), let

$$\begin{aligned} Q_i = \left( \begin{array}{ccc} m p_{i,1} / \overline{p} &{} \cdots &{} m p_{i,n} / \overline{p} \\ x_1 &{} \cdots &{} x_n\end{array}\right) , \end{aligned}$$

with \(P_i\) as in Eq. 1. Observe that, for each i and any \(1 \le j \le n\), \(m p_{i,j} / \overline{p} \ge 1\). Indeed, since the top row of \(P_i\) is a permutation of \(p_1, \ldots , p_n\) and \(m \ge n\),

$$\begin{aligned} \frac{m p_{i,j}}{\overline{p}} = \frac{m}{n} p_{i,j} \sum _{k=1}^n p_{i,k}^{-1} = \frac{m}{n} \left( 1 + \sum _{k \ne j} \frac{p_{i,j}}{p_{i,k}}\right) \ge \frac{m}{n} \ge 1. \end{aligned}$$

Furthermore, \(\sum _{i=1}^m Q_i^{-1} = 1\) coordinatewise. To see this, fix any \(l \in \left\{ 1, \ldots , n\right\} \) and \(k \in \left\{ 1, \ldots , r\right\} \). The number of \(P^\sigma \) in the orbit of P which place the value \(v_k\) (which appears \(n_k\) times in the top row of P) in the lth position is then

$$\begin{aligned} \frac{(n-1)!}{n_1! \cdots n_{k-1}! (n_k-1)! n_{k+1}! \cdots n_r!} = \frac{n_k}{n} m, \end{aligned}$$

recalling the formula given in Remark 4 for m. Therefore

$$\begin{aligned} \sum _{i=1}^m \frac{\overline{p}}{m p_{i,l}} = \frac{\overline{p}}{m} \sum _{i=1}^m p_{i,l}^{-1} = \frac{\overline{p}}{n} \sum _{k=1}^r \frac{n_k}{v_k} = \frac{\overline{p}}{n} \sum _{j=1}^n p_j^{-1} = 1, \end{aligned}$$

by the definition of \(\overline{p}\), so Proposition 1 (Hölder’s inequality) can be applied to the functions \(f_1^{\overline{p}/m}, \ldots , f_m^{\overline{p}/m}\), yielding

$$\begin{aligned} \int _X \prod _{i=1}^m f_i^{\overline{p}/m} \le \prod _{i=1}^m \Vert f_i^{\overline{p}/m} \Vert _{Q_i} = \prod _{i=1}^m \Vert f_i \Vert _{P_i}^{\overline{p}/m}. \end{aligned}$$

Taking the \(\overline{p}\)th root of each side gives the desired result. \(\square \)
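A two-variable numerical check of Theorem 3 under counting measure, with \(P\) having exponents \((2,1)\), so \(m = 2\) and \(\overline{p} = 4/3\) (a sketch; helper names are mine):

```python
import numpy as np

def mn(f, exponents):
    # Mixed norm for counting measure; axis 0 is integrated out first.
    g = np.asarray(f, dtype=float)
    for p in exponents:
        g = (g ** p).sum(axis=0) ** (1.0 / p)
    return float(g)

rng = np.random.default_rng(3)
f1, f2 = rng.random((4, 5)), rng.random((4, 5))
pbar = 4 / 3
# Left side: L^{4/3} norm of the geometric mean (f1 f2)^{1/2}.
lhs = ((f1 * f2) ** (pbar / 2)).sum() ** (1 / pbar)
# Right side: geometric mean of the two mixed norms in the orbit of (2, 1);
# only the exponents are permuted, the variables stay in order.
rhs = (mn(f1, (2, 1)) * mn(f2, (1, 2))) ** 0.5
assert lhs <= rhs + 1e-12
```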

One mixed norm may be defined by several different double n-tuples. For example, if

$$\begin{aligned} P_1 = \left( \begin{array}{ccc} 3 &{} 2 &{} 2 \\ x_1 &{} x_2 &{} x_3\end{array}\right) \text { and } P_2 = \left( \begin{array}{ccc} 3 &{} 2 &{} 2 \\ x_1 &{} x_3 &{} x_2\end{array}\right) , \end{aligned}$$

then for any measurable \(f(x_1, x_2, x_3) \ge 0\),

$$\begin{aligned} \left\| f \right\| _{P_1} = \left( \int _{X_2 \times X_3} \left( \int _{X_1} f^3 d\mu _1 \right) ^{2/3} d(\mu _2 \times \mu _3) \right) ^{1/2} = \left\| f \right\| _{P_2} \end{aligned}$$

by Tonelli’s theorem. (Tonelli’s theorem is a variation on Fubini’s theorem, which applies to nonnegative functions but does not require integrability. See such texts as [9], where it appears as Theorem 2.37(a).)

In general, the order of the variables associated with consecutive repeated exponents does not change the norm. (In this example, the order of \(x_2\) and \(x_3\) is immaterial.) Therefore, we identify any double n-tuples which differ only in the order of variables within such blocks of repeated exponents. With this identification, as long as P satisfies \(p_1 \ge \cdots \ge p_n\), a simple counting argument shows that the orbit \(\left\{ P_\sigma : \sigma \in S_n\right\} \) has the same number of elements m (from Definition 7, computed in Remark 4) as the orbit \(\left\{ P^\sigma : \sigma \in S_n\right\} \).

Furthermore, whenever \(p_1 \ge \cdots \ge p_n\), P is maximal in its orbit for Theorem 2 (Minkowski’s inequality for mixed norms), in the sense that for each \(\sigma \in S_n\), \(\left\| f \right\| _{P \cdot \sigma } \le \left\| f \right\| _P\) for any \(f \in L^+(X)\). These two properties lead to the following result. Although it closely resembles Theorem 3, from which it is derived, note that here we consider the double n-tuples \(P_\sigma \) rather than \(P^\sigma \). This means that, while Theorem 3 permutes the exponents while leaving the order of the variables fixed, here the exponents keep their order while the variables are permuted.

Corollary 2

Given a fixed P with \(p_1 \ge \cdots \ge p_n\), let its orbit \(\left\{ P_\sigma : \sigma \in S_n\right\} \), modulo the above identification, be enumerated by \(P_1, \ldots , P_m\). For any functions \(f_1, \ldots , f_m \in L^+(X)\),

$$\begin{aligned} \left\| \prod _{i=1}^m f_i^{1/m} \right\| _{L^{\overline{p}}(X)} \le \prod _{i=1}^m \left\| f_i \right\| _{P_i}^{1/m}. \end{aligned}$$

Proof

For each \(P_\sigma \) in the orbit \(\left\{ P_\sigma : \sigma \in S_n\right\} \), there is a corresponding \(P^{\sigma ^{-1}} = P_\sigma \cdot \sigma ^{-1}\) in the other orbit, \(\left\{ P^\sigma : \sigma \in S_n\right\} \). Let \(Q_1, \ldots , Q_m\) be obtained from \(P_1, \ldots , P_m\) in this way; that is, writing each \(P_i = P_{\sigma _i}\), the corresponding \(Q_i = P^{\sigma _i^{-1}}\). These \(Q_i\) enumerate the collection of \(P^{\sigma ^{-1}}\), which is in fact the orbit \(\left\{ P^\sigma : \sigma \in S_n\right\} \).

By Theorem 3,

$$\begin{aligned} \left\| \prod _{i=1}^m f_i^{1/m} \right\| _{L^{\overline{p}}(X)} \le \prod _{i=1}^m \left\| f_i \right\| _{Q_i}^{1/m}. \end{aligned}$$

Because each \(P_i\) can be obtained from \(Q_i\) by sorting its columns so that the exponents are in decreasing order, by Corollary 1, each \(\left\| f_i\right\| _{Q_i} \le \left\| f_i\right\| _{P_i}\). \(\square \)

Corollary 3

Given a fixed P with \(p_1 \ge \cdots \ge p_n\), let its orbit \(\left\{ P_\sigma : \sigma \in S_n\right\} \) be enumerated by \(P_1, \ldots , P_m\). For any \(f \in L^+(X)\),

$$\begin{aligned} \left\| f \right\| _{L^{\overline{p}}(X)} \le \prod _{i=1}^m \left\| f \right\| _{P_i}^{1/m}. \end{aligned}$$

Proof

Simply apply Corollary 2 with each \(f_i = f\). \(\square \)

Remark 5

The exponent \(\overline{p}\) on the left-hand side of the inequality in each of Theorem 3 and Corollaries 2 and 3 is the only exponent p such that the result is valid for all \(\sigma \)-finite measure spaces, even allowing a constant C (depending on the spaces, but not the functions \(f_i\)) such that

$$\begin{aligned} \left\| \prod _{i=1}^m f_i^{1/m} \right\| _{L^p(X)} \le C \prod _{i=1}^m \left\| f_i \right\| _{P_i}^{1/m}. \end{aligned}$$

(Consider \(X_1 = \cdots = X_n = \mathbf {R}\) and each \(f_1 = \cdots = f_m = \prod _{j=1}^n \chi _{[0,t]}(x_j)\), then take limits \(t \rightarrow 0\) and \(t \rightarrow \infty \). Similar examples are possible in any spaces featuring sets of arbitrarily small and arbitrarily large measure.)

As an additional note, when using either of Corollaries 2 and 3, it suffices to specify only the top row as an n-tuple \(\left( p_1, \ldots , p_n\right) \) with \(p_1 \ge \cdots \ge p_n\), for this is enough to determine both the orbit \(\left\{ P_\sigma : \sigma \in S_n\right\} \) and \(\overline{p}\).

4 Applications of main results

These results provide an easy way to generate mixed-norm estimates, where most of the computational work is finding the harmonic mean \(\overline{p}\). Many estimates in the literature are simple consequences of Theorem 3 and Corollary 2, and can now be easily proved and generalized.

Perhaps the simplest application is a mixed-norm intermediate result to Littlewood’s 4/3 inequality, a fundamental step in the theory of multilinearity, and an early example of the importance of \(L^p\) for exponents p other than the ubiquitous 1, 2, and \(\infty \). One modern source describing Littlewood’s 4/3 inequality is Garling’s book [11], where the proof of the inequality, there Corollary 18.1.1, establishes and uses this mixed-norm estimate.

As with many of these sorts of results, the original was given for sums, but these methods easily generalize it to integrals.

Proposition 2

For any \(\sigma \)-finite measure spaces \((X, \mu )\) and \((Y, \nu )\) and any function \(f(x,y) \in L^+(X \times Y)\),

$$\begin{aligned} \left( \int _{X \times Y} f^{\frac{4}{3}} d\mu d\nu \right) ^{\frac{3}{4}}&\le \left( \displaystyle \int _Y \left( \displaystyle \int _X f^2 d\mu \right) ^{\frac{1}{2}} d\nu \right) ^{\frac{1}{2}} \left( \displaystyle \int _X \left( \displaystyle \int _Y f^2 d\nu \right) ^{\frac{1}{2}} d\mu \right) ^{\frac{1}{2}}. \end{aligned}$$

Proof

Use Corollary 3 with \(P = {{2 \, 1} \atopwithdelims (){x \, y}}\), so \(\overline{p} = \left( \frac{2^{-1} + 1^{-1}}{2} \right) ^{-1} = \frac{4}{3}\). \(\square \)
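For sequences (counting measure), Proposition 2 can be spot-checked directly; the following is an illustrative sketch, not part of the original argument:

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.random((6, 7))  # f(x, y) >= 0, axis 0 is x, axis 1 is y
lhs = (f ** (4 / 3)).sum() ** (3 / 4)
rhs_1 = ((f ** 2).sum(axis=0) ** 0.5).sum()  # sum over y of the l2 norm in x
rhs_2 = ((f ** 2).sum(axis=1) ** 0.5).sum()  # sum over x of the l2 norm in y
assert lhs <= (rhs_1 * rhs_2) ** 0.5 + 1e-12
```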

Blei gives a similar 6/5 inequality with three variables in Lemma 2 on page 430 of [5], again stated for series but easily generalized to integrals on any \(\sigma \)-finite spaces. To produce and prove this result, simply apply Corollary 3 with \(P = \left( 2, 1, 1\right) \), so \(\overline{p} = 6/5\).

These results find a generalization in Blei’s Lemma 5.3 from [4], which considers exponents 2 and 1, each appearing arbitrarily often. A special case of this mixed-norm estimate was used as Lemma 1 in [8], a paper using multilinear techniques to study the Bohnenblust–Hille inequality. Preliminary definitions are followed by a generalization of Blei’s result from sums to integrals.

Definition 8

Consider integers \(J> K > 0\). Let \(N = {\left( {\begin{array}{c}J\\ K\end{array}}\right) }\) and let \(S_1, \ldots , S_N\) enumerate the subsets of \(\left\{ 1, \ldots , J\right\} \) with cardinality K. For \(1~\le ~\alpha ~\le ~N\), let \(\sim S_\alpha \) denote the complement \(\left\{ 1, \ldots , J\right\} {\setminus } S_\alpha \).

Proposition 3

For any \(\sigma \)-finite measure spaces \((X_1, \mu _1), \ldots , (X_J, \mu _J)\) and any measurable function \(f(x_1, \ldots , x_J)\) on \(X_1 \times \cdots \times X_J\),

$$\begin{aligned} \left( \int _{\left\{ 1, \ldots , J\right\} } \left| f\right| ^\frac{2J}{K+J} \right) ^\frac{K+J}{2J} \le \prod _{\alpha =1}^N \left[ \int _{S_\alpha } \left( \int _{\sim S_\alpha } |f|^2 \right) ^{1/2} \right] ^{1/N}, \end{aligned}$$

where for any subset \(E \subset \left\{ 1, \ldots , J\right\} \), the notation \(\int _E\) denotes integration over the product space \(\prod _{k \in E} X_k\).

Proof

To prepare for Corollary 3, let \(P = \left( \begin{array}{llllll} 2&\cdots&2&1&\cdots&1 \end{array}\right) \), with K copies of 1 and \(J-K\) copies of 2. There are exactly \({\left( {\begin{array}{c}J\\ K\end{array}}\right) }\) norms in the orbit of P, because each such norm is determined by choosing K variables to place with the 1 exponents. The K indices of these variables form a subset \(S_\alpha \) of \(\left\{ 1, \ldots , J\right\} \). With the remaining variables, in \(\sim S_\alpha \), associated with the exponent 2, we form a mixed norm \(P_\alpha \) such that

$$\begin{aligned} \left\| f \right\| _{P_\alpha } = \int _{S_\alpha } \left( \int _{\sim S_\alpha } |f|^2 \right) ^{1/2}. \end{aligned}$$

With K copies of 1 and \(J-K\) copies of 2, the harmonic mean is

$$\begin{aligned} \overline{p} = \left( \frac{ K + \frac{1}{2}(J-K)}{J} \right) ^{-1} = \frac{2J}{K+J}, \end{aligned}$$

so the desired result follows from Corollary 3. \(\square \)

Blei’s method of proof rests on the same foundation, the inequalities of Hölder and Minkowski, but takes three pages for an induction using the single-variable Hölder’s inequality rather than using mixed-norm techniques. Not only do we have a quicker and easier proof, but it is now straightforward to find generalizations beyond the exponents 1 and 2.
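Since the counting-measure case of Proposition 3 involves only finite sums, it can be sanity-checked numerically. The following sketch (not part of the argument) uses numpy with the assumed sample parameters \(J=3\), \(K=1\), so that \(N=3\) and \(2J/(K+J)=3/2\); the array shape is arbitrary.

```python
# Numerical sanity check of Proposition 3 on counting measure (illustrative only).
import itertools
import numpy as np

rng = np.random.default_rng(0)
J, K = 3, 1                     # assumed sample parameters
x = rng.random((4, 5, 6))       # a nonnegative "f" on a product of finite spaces

subsets = list(itertools.combinations(range(J), K))  # the sets S_1, ..., S_N
N = len(subsets)

p_bar = 2 * J / (K + J)                              # the exponent 2J/(K+J)
lhs = (x ** p_bar).sum() ** (1 / p_bar)

rhs = 1.0
for S in subsets:
    comp = tuple(k for k in range(J) if k not in S)  # the complement ~S
    inner = (x ** 2).sum(axis=comp) ** 0.5           # L^2 "integral" over ~S
    rhs *= inner.sum() ** (1 / N)                    # L^1 over S, then geometric mean

assert lhs <= rhs + 1e-9
```

Here the inner sum over \(\sim S_\alpha \) plays the role of the \(L^2\) integral, and the geometric mean on the right-hand side is accumulated one factor at a time.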

Proposition 4

For any \(0< p < q \le \infty \), \(\sigma \)-finite measure spaces \((X_1, \mu _1), \ldots , (X_J, \mu _J)\), and any measurable function \(f(x_1, \ldots , x_J)\) on \(X_1 \times \cdots \times X_J\),

$$\begin{aligned} \left\| f \right\| _{\frac{Jpq}{pJ + (q-p)K}} \le \prod _{\alpha =1}^N \left[ \int _{S_\alpha } \left( \int _{\sim S_\alpha } |f|^q \right) ^{p/q} \right] ^{1/Np}, \end{aligned}$$

where for any subset \(E \subset \left\{ 1, \ldots , J\right\} \), the notation \(\int _E\) denotes integration over the product space \(\prod _{k \in E} X_k\).

Proof

Let \(P = \left( \begin{array}{llllll} q&\cdots&q&p&\cdots&p \end{array}\right) \), with K copies of p and \(J-K\) copies of q. The harmonic mean is

$$\begin{aligned} \overline{p} = \left( \frac{p^{-1}K + q^{-1}(J-K)}{J}\right) ^{-1} = \frac{Jpq}{Jp + K(q-p)}, \end{aligned}$$

and the argument is otherwise like the proof of Proposition 3. \(\square \)
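The same numerical sanity check applies to Proposition 4 on counting measure; the sketch below uses the assumed sample values \(p = 3/2\), \(q = 4\), \(J = 4\), \(K = 2\) (so \(N = 6\)), with the harmonic mean computed as in the proof.

```python
# Sketch verification of Proposition 4 on counting measure (illustrative only).
import itertools
import numpy as np

rng = np.random.default_rng(1)
p, q, J, K = 1.5, 4.0, 4, 2     # assumed sample values
x = rng.random((3, 3, 4, 2))    # a nonnegative "f"

subsets = list(itertools.combinations(range(J), K))
N = len(subsets)

p_bar = J * p * q / (J * p + K * (q - p))  # harmonic mean from the proof
lhs = (x ** p_bar).sum() ** (1 / p_bar)

rhs = 1.0
for S in subsets:
    comp = tuple(k for k in range(J) if k not in S)
    inner = (x ** q).sum(axis=comp) ** (p / q)   # L^q over ~S, raised to p/q
    rhs *= inner.sum() ** (1 / (N * p))          # sum over S, exponent 1/(Np)

assert lhs <= rhs + 1e-9
```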

This technique could easily produce similar results using three or more distinct exponents, but Corollary 3 already addresses arbitrarily many.

Remark 6

Each of Propositions 2, 3, and 4 can be easily generalized to use several functions rather than one, simply by applying Corollary 2 rather than Corollary 3.

5 Other mixed-norm estimates

Although Theorem 3 and Corollary 2 offer rather polished results, not every situation calls for these estimates. However, the mixed-norm Hölder’s and Minkowski’s inequalities can be used in other ways, perhaps combined with different techniques. For example, neither Theorem 3 nor Corollary 2 yields Theorems 2.1 and 2.2 in [17], but the inductive proofs given can be replaced with much simpler mixed-norm methods. The result follows after suitable definitions.

Definition 9

For \(j=1, 2, \ldots , n\), let \((M_j, \mu _j)\) be \(\sigma \)-finite measure spaces and define the product measure spaces \((M^n, \mu ^n)\) and \((M^n_j, \mu ^n_j)\) by

$$\begin{aligned} M^n = \prod _{k=1}^n M_k, \qquad \mu ^n = \prod _{k=1}^n \mu _k, \qquad M^n_j = \prod _{\substack{k=1 \\ k \ne j}}^n M_k, \qquad \mu ^n_j = \prod _{\substack{k=1 \\ k \ne j}}^n \mu _k. \end{aligned}$$

Proposition 5

(Theorems 2.1 and 2.2 in [17]) If \(n \ge 2\) and \(q_1, \ldots , q_n\) are positive (possibly infinite) exponents such that \(\sum _{j=1}^n \frac{1}{q_j} \le 1\), then for any nonnegative \(\mu ^n\)-measurable functions \(f_1, \ldots , f_n\),

$$\begin{aligned} \int _{M^n} f_1 \cdots f_n d\mu ^n&\le \prod _{j=1}^n \left( \int _{M_j} \left( \int _{M^n_j} f_j^{q_j} d\mu ^n_j \right) ^{p_j/q_j} d\mu _j \right) ^{1/p_j} \end{aligned}$$
(2)
$$\begin{aligned} \text { and } \int _{M^n} f_1 \cdots f_n d\mu ^n&\le \prod _{j=1}^n \left( \int _{M^n_j} \left( \int _{M_j} f_j^{q_j} d\mu _j \right) ^{s_j/q_j} d\mu ^n_j \right) ^{1/s_j}, \end{aligned}$$
(3)

where \(\frac{1}{p_j} = \frac{1}{q_j} + 1 - \sum _{k=1}^n \frac{1}{q_k}\) and \(\frac{1}{s_j} = \frac{1}{q_j} + \frac{1}{n-1} (1 - \sum _{k=1}^n \frac{1}{q_k})\).

Proof

To prove the first inequality, define, for each \(1 \le j \le n\),

$$\begin{aligned} P_j = \left( \begin{array}{ccc} p_{j,1} & \cdots & p_{j,n} \\ x_1 & \cdots & x_n \end{array}\right) , \end{aligned}$$

where each \(p_{j,j} = p_j\) and, for \(j \ne k\), \(p_{j,k} = q_j\). The hypotheses ensure that every \(p_{j,k} \ge 1\) and that \(\sum _{j=1}^n P_j^{-1} = 1\) coordinatewise, i.e. for each \(1 \le k \le n\), \(\sum _{j=1}^n \frac{1}{p_{j,k}} = 1\). Therefore Hölder’s inequality (Proposition 1) gives

$$\begin{aligned} \int _{M^n} f_1 \cdots f_n d\mu ^n \le \prod _{j=1}^n \left\| f_j \right\| _{P_j}. \end{aligned}$$

Because each \(p_j \le q_j\), Minkowski’s inequality (Corollary 1) gives inequality (2), where each \(L^{p_j}_{\mu _j}\) norm over \(M_j\) comes last.

For the second inequality, let

$$\begin{aligned} S_j = \left( \begin{array}{ccc} s_{j,1} & \cdots & s_{j,n} \\ x_1 & \cdots & x_n \end{array}\right) , \end{aligned}$$

where each \(s_{j,j} = q_j\) and, for \(j \ne k\), \(s_{j,k} = s_j\). Again, each \(s_{j,k} \ge 1\) and \(\sum _{j=1}^n S_j^{-1} = 1\) coordinatewise. By Hölder’s inequality,

$$\begin{aligned} \int _{M^n} f_1 \cdots f_n d\mu ^n \le \prod _{j=1}^n \left\| f_j \right\| _{S_j}. \end{aligned}$$

Now Minkowski’s integral inequality gives inequality (3), because each \(s_j \le q_j\). \(\square \)
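Both inequalities of Proposition 5 can be sanity-checked on counting measure. The sketch below (illustrative only) takes \(n = 3\) and the assumed sample exponents \(q_1 = q_2 = q_3 = 4\), so that \(\sum_j 1/q_j = 3/4 \le 1\), and computes \(p_j\) and \(s_j\) from the formulas in the statement.

```python
# Numerical sketch of Proposition 5, inequalities (2) and (3), on counting measure.
import numpy as np

rng = np.random.default_rng(2)
n = 3
q = [4.0, 4.0, 4.0]                             # assumed sample exponents
eps = 1 - sum(1 / qj for qj in q)
p = [1 / (1 / qj + eps) for qj in q]            # 1/p_j = 1/q_j + 1 - sum 1/q_k
s = [1 / (1 / qj + eps / (n - 1)) for qj in q]  # 1/s_j = 1/q_j + eps/(n-1)

f = [rng.random((3, 4, 5)) for _ in range(n)]   # nonnegative f_1, f_2, f_3
lhs = (f[0] * f[1] * f[2]).sum()

rhs2 = rhs3 = 1.0
for j in range(n):
    others = tuple(k for k in range(n) if k != j)
    # inequality (2): L^{q_j} over M^n_j first, then L^{p_j} over M_j last
    inner2 = (f[j] ** q[j]).sum(axis=others) ** (p[j] / q[j])
    rhs2 *= inner2.sum() ** (1 / p[j])
    # inequality (3): L^{q_j} over M_j first, then L^{s_j} over M^n_j last
    inner3 = (f[j] ** q[j]).sum(axis=j) ** (s[j] / q[j])
    rhs3 *= inner3.sum() ** (1 / s[j])

assert lhs <= rhs2 + 1e-9 and lhs <= rhs3 + 1e-9
```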

Mixed-norm techniques offer not only easy proofs of known inequalities, but often a simple route to generalizations as well. For example, the following inequality features exponents \(q_i\) which do not quite satisfy the Hölder criterion (with the gap filled by the \(p_i\)), as in Proposition 5, drawn from Popa and Sinnamon [17]. However, it combines this feature with the variable-sized subsets present in Proposition 3, based on Blei [4].

We resume our initial notation, where \((X_1, \mu _1), \ldots , (X_n, \mu _n)\) are any \(\sigma \)-finite measure spaces with product \((X,\mu )\).

Theorem 4

Let \(0< k < n\) and \(m = {\left( {\begin{array}{c}n\\ k\end{array}}\right) }\), and let \(S_1, \ldots , S_m\) enumerate the size-k subsets of \(\left\{ 1, \ldots , n\right\} \). Consider any positive (possibly infinite) exponents \(q_1, \ldots , q_m\) such that \(\sum _{i=1}^m \frac{1}{q_i} \le 1\), and define \(\epsilon = 1 - \sum _{i=1}^m \frac{1}{q_i} \ge 0\). For any nonnegative numbers \(c_1, \ldots , c_m\) such that, for each \(j \in \left\{ 1, \ldots , n\right\} \), \(\sum _{S_i \ni j} c_i = 1\), and any nonnegative \(\mu \)-measurable functions \(f_1, \ldots , f_m\),

$$\begin{aligned} \int _{X} f_1 \cdots f_m d\mu \le \prod _{i=1}^m \left( \int _{S_i} \left( \int _{\sim S_i} f_i^{q_i} \right) ^{p_i/q_i} \right) ^{1/p_i}, \end{aligned}$$

where \(\frac{1}{p_i} = \frac{1}{q_i} + c_i \epsilon \) and \(\int _E\), for \(E \subset \left\{ 1, \ldots , n\right\} \), denotes integration over those \(X_j\) with \(j \in E\).

Remark 7

One possible choice of \(c_1, \ldots , c_m\) is \(c_1 = \cdots = c_m = 1/{\left( {\begin{array}{c}n-1\\ k-1\end{array}}\right) }\). When \(q_1 = \cdots = q_m\) as well, this leads to Proposition 4. (In the typical case \(n < m\), there are many other choices, as then the system \(\sum _{S_i \ni j} c_i = 1\) is underdetermined.) One can instead let k be either 1 or \(n-1\) to obtain Proposition 5.

Proof

For each \(1 \le i \le m\), define

$$\begin{aligned} P_i = \left( \begin{array}{ccc} p_{i,1} & \cdots & p_{i,n} \\ x_1 & \cdots & x_n \end{array}\right) \end{aligned}$$

where each \(p_{i,j} = p_i\) if \(j \in S_i\), and \(p_{i,j} = q_i\) otherwise. Since \(\sum _{k=1}^m q_k^{-1} \le 1\), each \(q_i \ge 1\), so \(q_i^{-1} \le 1\). Because each \(0 \le c_i \le 1\),

$$\begin{aligned} \frac{1}{p_i} = \frac{1}{q_i} + c_i\bigg (1 - \sum _{k=1}^m \frac{1}{q_k}\bigg ) \le \frac{1}{q_i} + c_i \bigg (1 - \frac{1}{q_i}\bigg ) = c_i \cdot 1 + (1-c_i) \frac{1}{q_i} \le 1, \end{aligned}$$

so each \(p_i \ge 1\). To apply Hölder’s inequality, it remains only to prove that \(\sum _{i=1}^m P_i^{-1} = 1\) coordinatewise.

For any \(j \in \left\{ 1, \ldots , n\right\} \),

$$\begin{aligned} \sum _{i=1}^m \frac{1}{p_{i,j}} = \sum _{i=1}^m \frac{1}{q_i} + \sum _{S_i \ni j} c_i \epsilon = 1 - \epsilon + \epsilon \sum _{S_i \ni j} c_i = 1. \end{aligned}$$

Finally, apply Hölder’s inequality with mixed norms \(P_1, \ldots , P_m\) to functions \(f_1, \ldots , f_m\) respectively, followed by Minkowski’s fully-sorted inequality, Corollary 1. (Note that each \(p_i \le q_i\), so the \(q_i\) norm over the variables outside of \(S_i\) comes first.) \(\square \)
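On counting measure, Theorem 4 too reduces to finite sums and can be sanity-checked numerically. The sketch below (illustrative only) takes \(n = 4\), \(k = 2\), so \(m = 6\); the distinct exponents \(q_i\) are assumed sample values chosen so that \(\sum_i 1/q_i = 1/2\), and the weights \(c_i\) satisfy \(\sum_{S_i \ni j} c_i = 1\) for each j.

```python
# Numerical sketch of Theorem 4 on counting measure, with n = 4, k = 2, m = 6.
import itertools
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 2
subsets = list(itertools.combinations(range(n), k))   # S_1, ..., S_6 in order

q = [8.0, 12.0, 24.0, 24.0, 12.0, 8.0]                # assumed: sum of 1/q_i = 1/2
eps = 1 - sum(1 / qi for qi in q)
c = [1/2, 1/3, 1/6, 1/6, 1/3, 1/2]                    # sum over S_i containing j is 1
p = [1 / (1 / qi + ci * eps) for qi, ci in zip(q, c)]  # 1/p_i = 1/q_i + c_i*eps

fs = [rng.random((2, 3, 2, 3)) for _ in subsets]      # nonnegative f_1, ..., f_6
lhs = np.prod(fs, axis=0).sum()

rhs = 1.0
for S, qi, pi, fi in zip(subsets, q, p, fs):
    comp = tuple(j for j in range(n) if j not in S)   # ~S_i
    inner = (fi ** qi).sum(axis=comp) ** (pi / qi)    # L^{q_i} over ~S_i comes first
    rhs *= inner.sum() ** (1 / pi)                    # L^{p_i} over S_i comes last

assert lhs <= rhs + 1e-9
```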

What follows is perhaps the simplest case of Theorem 4 which gives a new concrete inequality, not a consequence of either Proposition 4 or Proposition 5. As always, this generalizes to various \(\sigma \)-finite measure spaces or to several distinct functions, but the result is stated here in a simple form.

Proposition 6

Let \(x = x_{i,j,k,l}\) be any quadruply-indexed collection of nonnegative real numbers, where each index takes at most countably many values. Define

$$\begin{aligned} A&= \Big ( \sum _{k,l} \Big ( \sum _{i,j} x^{12} \Big )^{1/4}\Big )^{1/3} \Big (\sum _{i,j} \Big (\sum _{k,l} x^{12} \Big )^{1/4} \Big )^{1/3}, \\ B&= \Big ( \sum _{j,l} \Big ( \sum _{i,k} x^{12} \Big )^{1/3} \Big )^{1/4} \Big ( \sum _{i,k} \Big (\sum _{j,l} x^{12} \Big )^{1/3} \Big )^{1/4}, \\ C&= \Big ( \sum _{j,k} \Big ( \sum _{i,l} x^{12} \Big )^{1/2} \Big )^{1/6} \Big ( \sum _{i,l} \Big (\sum _{j,k} x^{12} \Big )^{1/2} \Big )^{1/6}. \end{aligned}$$

Then

$$\begin{aligned} \sum _{i,j,k,l} x^6 \le ABC. \end{aligned}$$

Proof

To apply Theorem 4, let \(n=4\) and \(k=2\), so that \(m = 6\). Let \(q_1 = \cdots = q_6 = 12\), so that \(\epsilon = 1 - \sum _{i=1}^6 q_i^{-1} = 1/2\). Enumerate the two-element subsets of \(\left\{ 1, 2, 3, 4\right\} \) by \(S_1 = \left\{ 1, 2\right\} , S_2 = \left\{ 1, 3\right\} , S_3 = \left\{ 1,4\right\} , S_4 = \left\{ 2,3\right\} , S_5 = \left\{ 2,4\right\} ,\) and \(S_6 = \left\{ 3,4\right\} \). Observe that \(1 \in S_1, S_2, S_3\), \(2 \in S_1, S_4, S_5\), \(3 \in S_2, S_4, S_6\), and \(4 \in S_3, S_5, S_6\).

Let \(c_1 = c_6 = 1/2\), \(c_2 = c_5 = 1/3\), and \(c_3 = c_4 = 1/6\), so that

$$\begin{aligned} c_1 + c_2 + c_3 = c_1 + c_4 + c_5 = c_2 + c_4 + c_6 = c_3 + c_5 + c_6 = 1. \end{aligned}$$

Now the result follows from Theorem 4, noting that \(p_1 = p_6 = 3\), \(p_2 = p_5 = 4\), and \(p_3 = p_4 = 6\), and letting each function be x. \(\square \)
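Because Proposition 6 involves only sums, it can be checked directly on a small random array. The sketch below (illustrative only; the array shape is arbitrary) transcribes the definitions of A, B, and C term by term, with index order \((i, j, k, l)\) as axes \((0, 1, 2, 3)\).

```python
# Direct numerical check of Proposition 6 on a small random nonnegative array.
import numpy as np

rng = np.random.default_rng(4)
x = rng.random((2, 3, 2, 3))  # indices i, j, k, l as axes 0, 1, 2, 3

x12 = x ** 12
A = ((x12.sum(axis=(0, 1)) ** 0.25).sum() ** (1 / 3)      # sum over i,j inside
     * (x12.sum(axis=(2, 3)) ** 0.25).sum() ** (1 / 3))   # sum over k,l inside
B = ((x12.sum(axis=(0, 2)) ** (1 / 3)).sum() ** 0.25      # sum over i,k inside
     * (x12.sum(axis=(1, 3)) ** (1 / 3)).sum() ** 0.25)   # sum over j,l inside
C = ((x12.sum(axis=(0, 3)) ** 0.5).sum() ** (1 / 6)       # sum over i,l inside
     * (x12.sum(axis=(1, 2)) ** 0.5).sum() ** (1 / 6))    # sum over j,k inside

assert (x ** 6).sum() <= A * B * C + 1e-9
```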