1 Introduction: Orthogonal Martingales and the Beurling-Ahlfors Transform

The main result of this note is Theorem 7 below. Of main interest is the array of new Bellman functions, which are very different from Burkholder’s function.

A complex-valued martingale $Y = Y_{1} + iY_{2}$ is said to be orthogonal if the quadratic variations of the coordinate martingales are equal and their mutual covariation is 0:

$$\displaystyle{\left <Y _{1}\right> = \left <Y _{2}\right>,\hspace{8.53581pt} \left <Y _{1},Y _{2}\right> = 0.}$$

In [2], Bañuelos and Janakiraman make the observation that the martingale associated with the Beurling-Ahlfors transform is, in fact, an orthogonal martingale. They show that Burkholder’s proof in [9] naturally accommodates this property and leads to an improvement in the estimate of the norm $\|B\|_{p}$ of the Beurling-Ahlfors transform, which is given by the formula

$$\displaystyle{Bf(z):= \frac{1} {\pi } \int \frac{f(\zeta )} {\zeta -z}dm_{2}(\zeta ).}$$

Theorem 1 (One-Sided Orthogonality)

  1. (i)

    (Left-side orthogonality) Suppose 2 ≤ p < ∞. If Y is an orthogonal martingale and X is any martingale such that \(\left <Y \right> \leq \left <X\right>\) , then

    $$\displaystyle{ \|Y \|_{p} \leq \sqrt{\frac{p^{2 } - p} {2}} \|X\|_{p}. }$$
    (1)
  2. (ii)

    (Right-side orthogonality) Suppose 1 < p < 2. If X is an orthogonal martingale and Y is any martingale such that \(\left <Y \right> \leq \left <X\right>\) , then

    $$\displaystyle{ \|Y \|_{p} \leq \sqrt{ \frac{2} {p^{2} - p}}\|X\|_{p}. }$$
    (2)

It is not known whether these estimates are the best possible.

Remark

The result for left-side orthogonality was proved in [2]. The result for right-side orthogonality was stated in [20]. In [20] we emulate [2] to provide, in a rather simple way, an estimate for right-side orthogonality in the regime 1 < p ≤ 2. In the present work we try to come up with a better constant for this regime, as the sharpness of the constants in [2] and [20] is somewhat dubious. For that purpose we build a family of new (funny and interesting) Bellman functions, very different from the original Burkholder function. Even though the approach is quite different from the one in [2] and [20], the constants we obtain here are the same! So, maybe they are sharp after all [1, 3, 4, 6–8, 10–13, 15–19, 21]. The Bellman function approach to harmonic analysis problems was used in [22–29]. Implicitly it was used in [30] as well. It was extended in [33–37].

If X and Y are the martingales associated with f and Bf respectively, then Y is orthogonal, \(\left <Y \right> \leq 4\left <X\right>\), see [2] (and Theorem 5 below), and hence by (1), one obtains

$$\displaystyle{ \|Bf\|_{p} \leq \sqrt{2(\,p^{2 } - p)}\|\,f\|_{p}\text{ for }p \geq 2. }$$
(3)

By interpolating this estimate \(\sqrt{2(\,p^{2 } - p)}\) with the known $\|B\|_{2} = 1$, Bañuelos and Janakiraman establish the currently best known estimate toward the conjecture of Iwaniec:

$$\displaystyle{ \|B\|_{p} \leq 1.575(\,p^{{\ast}}- 1), }$$
(4)

where \(p^{{\ast}} =\max (\,p, \frac{p} {p-1})\). This is the best estimate known to date for all p. For large p, however, a better estimate is contained in [5]:

$$\displaystyle{ \|B\|_{p} \leq 1.39(\,p - 1),\,\,p \geq 1000. }$$
(5)
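It may help to compare these bounds numerically. The short script below is an illustrative sketch of ours (not part of the argument): it evaluates the bounds (3), (4) and (5) at p = 1000 and confirms that, there, (5) is the smallest of the three.

```python
import math

# Evaluate the three bounds at p = 1000:
#   (3): sqrt(2 (p^2 - p)),  (4): 1.575 (p* - 1),  (5): 1.39 (p - 1);
# for p >= 2 we have p* = p.
p = 1000
b3 = math.sqrt(2 * (p * p - p))   # bound (3)
b4 = 1.575 * (p - 1)              # bound (4)
b5 = 1.39 * (p - 1)               # bound (5)
assert b5 < b3 < b4               # (5) is the best of the three at p = 1000
```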

The conjecture of Iwaniec states that

$$\displaystyle{ \|B\|_{p} \leq (\,p^{{\ast}}- 1). }$$
(6)

The reader who wants to see the operator theory origins of the problems in this article may consult [27, 32].

2 New Questions and Results

Since B is associated with left-side orthogonality and since we know $\|B\|_{p} = \|B\|_{p'}$, where $p' = \frac{p}{p-1}$, two important questions arise:

  1. (i)

If 2 ≤ p < ∞, what is the best constant $C_{p}$ in the left-side orthogonality problem: $\|Y\|_{p} \leq C_{p}\|X\|_{p}$, where Y is orthogonal and \(\left <Y \right> \leq \left <X\right>\)?

  2. (ii)

Similarly, if 1 < p′ < 2, what is the best constant $C_{p'}$ in the left-side orthogonality problem?

We have separated the two questions, since Burkholder’s proof (and his function) already gives a good answer, when p ≥ 2. This was the main observation of [2].

However, no estimate (better than p − 1) follows from analyzing Burkholder’s function when 1 < p′ < 2. Perhaps, we may hope that \(C_{p'} <\sqrt{\frac{p^{2 } -p} {2}}\) when \(1 <p' = \frac{p} {p-1} <2\), which would then imply a better estimate for $\|B\|_{p}$. This paper destroys this hope by finding $C_{p'}$; see Theorem 2. We also ask and answer an analogous question on right-side orthogonality when 2 < p < ∞. In the spirit of Burkholder [14], we believe these questions are of independent interest in martingale theory and may have deeper connections with other areas of mathematics.

Remark

The following sharp estimates are proved in [5]; they cover left-side orthogonality in the regime 1 < p ≤ 2 and right-side orthogonality in the regime 2 ≤ p < ∞. Notice that the two complementary regimes have some non-trivial estimates: 1) for 2 ≤ p < ∞ and left-side orthogonality in [2]; 2) for 1 < p ≤ 2 and right-side orthogonality in this note and in [20]; but their sharpness is somewhat dubious.

Theorem 2

Let Y = (Y 1 ,Y 2 ) be an orthogonal martingale and let X = (X 1 ,X 2 ) be an arbitrary martingale.

  1. (i)

Let 1 < p′ ≤ 2. Suppose \(\left <Y \right> \leq \left <X\right>\) . Then the least constant that always works in the inequality $\|Y\|_{p'} \leq C_{p'}\|X\|_{p'}$ is

    $$\displaystyle{ C_{p'} = \frac{1} {\sqrt{2}} \frac{z_{p'}} {1 - z_{p'}} }$$
    (7)

    where z p′ is the smallest root in (0,1) of the bounded Laguerre function L p′ .

  2. (ii)

Let 2 ≤ p < ∞. Suppose \(\left <X\right> \leq \left <Y \right>\) . Then the least constant that always works in the inequality $\|X\|_{p} \leq C_{p}\|Y\|_{p}$ is

    $$\displaystyle{ C_{p} = \sqrt{2}\frac{1 - z_{p}} {z_{p}} }$$
    (8)

    where z p is the smallest root in (0,1) of the bounded Laguerre function L p .

The bounded Laguerre function $L_{p}$ is a bounded solution of the ODE

$$\displaystyle{sL_{p}''(s) + (1 - s)L_{p}'(s) + pL_{p}(s) = 0.}$$
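For concreteness, the root $z_p$ can be computed numerically. The sketch below is our illustration (not from the original text); it assumes the normalization in which the bounded Laguerre function is the solution regular at the origin, i.e. the Kummer function $M(-p,1,s)$, which for integer p reduces to the classical Laguerre polynomial. It locates the smallest root in (0, 1) by bisection and evaluates the constant (8).

```python
import math

def laguerre_L(p, s, terms=120):
    # Power series of the solution of s L'' + (1 - s) L' + p L = 0 that is
    # regular at s = 0 (the Kummer function M(-p, 1, s)); this normalization
    # is our assumption.
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (k - p) / ((k + 1) ** 2) * s
    return total

def smallest_root(p, lo=1e-9, hi=1.0, iters=100):
    # Bisection; assumes a sign change on (0, 1), which holds in the
    # regimes considered here.
    assert laguerre_L(p, lo) > 0 > laguerre_L(p, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if laguerre_L(p, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z3 = smallest_root(3.0)              # smallest root of the Laguerre polynomial L_3
C3 = math.sqrt(2) * (1 - z3) / z3    # the constant (8) for p = 3
```

For p = 3 this gives $z_3 \approx 0.4158$, the smallest zero of the classical Laguerre polynomial $L_3$.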

3 Orthogonality

Let Z = (X, Y ), W = (U, V ) be two \(\mathbb{R}^{2}\)-valued martingales on the filtration of 2-dimensional Brownian motion $B_{s} = (B_{1s}, B_{2s})$. Let \(A = \left [\begin{array}{*{10}c} -1,& i\\ i, &1 \end{array} \right ]\). We want W to be the martingale transform of Z (defined by matrix A). Let

$$\displaystyle\begin{array}{rcl} & X(t) =\int _{ 0}^{t}\overrightarrow{x}(s) \cdot dB_{s},& {}\\ & Y (t) =\int _{ 0}^{t}\overrightarrow{y}(s) \cdot dB_{s},& {}\\ \end{array}$$

where X, Y are real-valued processes, and \(\overrightarrow{x}(s),\overrightarrow{y}(s)\) are \(\mathbb{R}^{2}\)-valued “martingale differences”.

Put

$$\displaystyle{ Z(t) = X(t) + iY (t),Z(t) =\int _{ 0}^{t}(\overrightarrow{x}(s) + i\overrightarrow{y}(s)) \cdot dB_{ s}, }$$
(9)

and

$$\displaystyle{ W(t) = U(t) + iV (t),W(t) =\int _{ 0}^{t}(A(\overrightarrow{x}(s) + i\overrightarrow{y}(s))) \cdot dB_{ s}. }$$
(10)

We will denote

$$\displaystyle{W = A \star Z.}$$

As before

$$\displaystyle\begin{array}{rcl} & U(t) =\int _{ 0}^{t}\overrightarrow{u}(s) \cdot dB_{s}, & {}\\ & V (t) =\int _{ 0}^{t}\overrightarrow{v}(s) \cdot dB_{s}, & {}\\ & W(t) =\int _{ 0}^{t}(\overrightarrow{u}(s) + i\overrightarrow{v}(s)) \cdot dB_{s}.& {}\\ \end{array}$$

We can easily write components of \(\overrightarrow{u}(s),\overrightarrow{v}(s)\):

$$\displaystyle\begin{array}{rcl} & u_{1}(s) = -x_{1}(s) - y_{2}(s),\,\,v_{1}(s) = x_{2}(s) - y_{1}(s),& {}\\ & u_{2}(s) = x_{2}(s) - y_{1}(s),\,\,v_{2}(s) = x_{1}(s) + y_{2}(s). & {}\\ \end{array}$$

Notice that

$$\displaystyle{ \overrightarrow{u} \cdot \overrightarrow{ v} = u_{1}v_{1} + u_{2}v_{2} = -(x_{1} + y_{2})(x_{2} - y_{1}) + (x_{2} - y_{1})(x_{1} + y_{2}) = 0. }$$
(11)
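The component formulas above are easy to test on random data. The sketch below is our own numerical sanity check (not part of the argument): it verifies the orthogonality (11), the equality of the quadratic variation densities of U and V, and the pointwise bound $|\overrightarrow{u}|^{2},|\overrightarrow{v}|^{2} \leq 2\,d\langle Z,Z\rangle$ proved in Lemma 4 below.

```python
import random

def transform(x1, x2, y1, y2):
    # components of W = A * Z, as computed above
    u = (-x1 - y2, x2 - y1)
    v = (x2 - y1, x1 + y2)
    return u, v

random.seed(0)
for _ in range(1000):
    x1, x2, y1, y2 = (random.uniform(-5, 5) for _ in range(4))
    u, v = transform(x1, x2, y1, y2)
    zz = x1**2 + x2**2 + y1**2 + y2**2
    uu = u[0]**2 + u[1]**2
    vv = v[0]**2 + v[1]**2
    assert abs(u[0]*v[0] + u[1]*v[1]) < 1e-9       # orthogonality (11)
    assert abs(uu - vv) < 1e-9                     # equal quadratic variations
    assert uu <= 2*zz + 1e-9 and vv <= 2*zz + 1e-9  # the Lemma 4 bound
```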

3.1 Local Orthogonality

The processes

$$\displaystyle\begin{array}{rcl} & \langle X,U\rangle (t):=\int _{ 0}^{t}\overrightarrow{x} \cdot \overrightarrow{ u}ds,\,\,\langle X,V \rangle (t):=\int _{ 0}^{t}\overrightarrow{x} \cdot \overrightarrow{ v}ds,& {}\\ & \langle Y,U\rangle (t):=\int _{ 0}^{t}\overrightarrow{y} \cdot \overrightarrow{ u}ds,\,\,\langle Y,V \rangle (t):=\int _{ 0}^{t}\overrightarrow{y} \cdot \overrightarrow{ v}ds, & {}\\ & \langle X,X\rangle (t):=\int _{ 0}^{t}\overrightarrow{x} \cdot \overrightarrow{ x}ds,\,\,\langle Y,Y \rangle (t):=\int _{ 0}^{t}\overrightarrow{y} \cdot \overrightarrow{ y}ds,& {}\\ & \langle X,Y \rangle (t):=\int _{ 0}^{t}\overrightarrow{x} \cdot \overrightarrow{ y}ds,\,\,\langle U,U\rangle (t):=\int _{ 0}^{t}\overrightarrow{u} \cdot \overrightarrow{ u}ds,& {}\\ & \langle V,V \rangle (t):=\int _{ 0}^{t}\overrightarrow{v} \cdot \overrightarrow{ v}ds,\,\,\langle U,V \rangle (t):=\int _{ 0}^{t}\overrightarrow{u} \cdot \overrightarrow{ v}ds. & {}\\ \end{array}$$

are called the covariance processes. We can denote

$$\displaystyle\begin{array}{rcl} & d\langle X,U\rangle (t):=\overrightarrow{ x}(t) \cdot \overrightarrow{ u}(t),\,\,d\langle X,V \rangle (t):=\overrightarrow{ x}(t) \cdot \overrightarrow{ v}(t), & {}\\ & d\langle Y,U\rangle (t):=\overrightarrow{ y}(t) \cdot \overrightarrow{ u}(t),\,\,d\langle Y,V \rangle (t):=\overrightarrow{ y}(t) \cdot \overrightarrow{ v}(t), & {}\\ & d\langle X,X\rangle (t):=\overrightarrow{ x}(t) \cdot \overrightarrow{ x}(t),\,\,d\langle Y,Y \rangle (t):=\overrightarrow{ y}(t) \cdot \overrightarrow{ y}(t), & {}\\ & d\langle X,Y \rangle (t):=\overrightarrow{ x}(t) \cdot \overrightarrow{ y}(t),\,\,d\langle U,U\rangle (t):=\overrightarrow{ u}(t) \cdot \overrightarrow{ u}(t), & {}\\ & d\langle V,V \rangle (t):=\overrightarrow{ v}(t) \cdot \overrightarrow{ v}(t),\,\,d\langle U,V \rangle (t):=\overrightarrow{ u}(t) \cdot \overrightarrow{ v}(t), & {}\\ & d\langle Z,Z\rangle (t):= (\overrightarrow{x}(t) \cdot \overrightarrow{ x}(t) +\overrightarrow{ y}(t) \cdot \overrightarrow{ y}(t)),\,\,d\langle W,W\rangle (t):= (\overrightarrow{u}(t) \cdot \overrightarrow{ u}(t) +\overrightarrow{ v}(t) \cdot \overrightarrow{ v}(t)).& {}\\ \end{array}$$

Of importance is the following observation.

Lemma 3

Let \(A = \left [\begin{array}{*{10}c} -1,& i\\ i, &1 \end{array} \right ]\) . Then

$$\displaystyle{ d\langle U,V \rangle (t) = 0. }$$
(12)

Or

$$\displaystyle{\overrightarrow{u}(t) \cdot \overrightarrow{ v}(t) = 0.}$$

Also we have the following statement.

Lemma 4

With the same A

$$\displaystyle\begin{array}{rcl} & d\langle U,U\rangle (t) \leq 2\,d\langle Z,Z\rangle (t).& {}\\ & d\langle V,V \rangle (t) \leq 2\,d\langle Z,Z\rangle (t).& {}\\ \end{array}$$

Or

$$\displaystyle\begin{array}{rcl} & \overrightarrow{u}(t) \cdot \overrightarrow{ u}(t) \leq 2\,(\overrightarrow{x}(t) \cdot \overrightarrow{ x}(t) +\overrightarrow{ y}(t) \cdot \overrightarrow{ y}(t)),& {}\\ & \overrightarrow{v}(t) \cdot \overrightarrow{ v}(t) \leq 2\,(\overrightarrow{x}(t) \cdot \overrightarrow{ x}(t) +\overrightarrow{ y}(t) \cdot \overrightarrow{ y}(t)).& {}\\ \end{array}$$

Or

$$\displaystyle{ d\langle W,W\rangle (t) \leq 4\,d\langle Z,Z\rangle (t). }$$
(13)

Proof

$$\displaystyle\begin{array}{rcl} & \overrightarrow{u}(t) \cdot \overrightarrow{ u}(t) = (x_{1} + y_{2})^{2} + (x_{2} - y_{1})^{2} = 2\,(x_{1}\,y_{2} - x_{2}\,y_{1})+ & {}\\ & (x_{1})^{2} + (y_{2})^{2} + (x_{2})^{2} + (y_{1})^{2} \leq 2\,((x_{1})^{2} + (y_{2})^{2} + (x_{2})^{2} + (y_{1})^{2}) = 2\,d\langle Z,Z\rangle.& {}\\ \end{array}$$

The same can be shown for v. □ 

Definition

The complex martingale $W = A \star Z$ is called the Ahlfors-Beurling transform of the martingale Z.

Now let us quote again the theorem of Bañuelos–Janakiraman from [2]:

Theorem 5

Let Z,W be any two martingales on the filtration of Brownian motion, such that W is an orthogonal martingale in the sense of  (12) : $d\langle U,V\rangle = 0$, and such that there is a subordination property

$$\displaystyle{ d\langle W,W\rangle \leq d\langle Z,Z\rangle }$$
(14)

Let p ≥ 2. Then

$$\displaystyle{ (\mathbf{E\,}\vert W\vert ^{p})^{1/p} \leq \sqrt{\frac{p^{2 } - p} {2}} (\mathbf{E\,}\vert Z\vert ^{p})^{1/p}. }$$
(15)

Further we will use the notation

$$\displaystyle{\|Z\|_{p}:= (\mathbf{E\,}\vert Z\vert ^{p})^{1/p}.}$$

Applied to our case (with the help of Lemmas 3 and 4) we get the following theorem from Theorem 5.

Theorem 6

\(\|W\|_{p} =\| A \star Z\|_{p} \leq \sqrt{2(\,p^{2 } - p)}\|Z\|_{p},\,\,\forall p \geq 2.\)

4 Subordination by Orthogonal Martingales in L 3∕2

For 1 < p ≤ 2 one has the following

Theorem 7

Let Z,W be any two \(\mathbb{R}^{2}\) martingales as above, and let W be an orthogonal martingale in the sense that:

$$\displaystyle{d\langle U,V \rangle = 0.}$$

Let us also assume

$$\displaystyle{ d\langle U,U\rangle = d\langle V,V \rangle. }$$
(16)

Let Z be subordinated to the orthogonal martingale W:

$$\displaystyle{ d\langle Z,Z\rangle \leq d\langle W,W\rangle }$$
(17)

Then for 1 < q ≤ 2

$$\displaystyle{ \|Z\|_{q} \leq \sqrt{ \frac{2} {q^{2} - q}}\|W\|_{q}. }$$
(18)

Below we will give the proof for all q ∈ (1, 2], but first we present the proof for q = 3∕2 only. Moreover, our general-case proof may indicate that the constant \(\sqrt{ \frac{ 2} {p^{2}-p}}\) is sharp after all. (Note that a completely different proof, but with the same constant, is given in [20].)

Proof

We assume that \(F = (\Phi,\Psi )\) (or \(F = \Phi + i\Psi\)) is a martingale on the filtration of Brownian motion

$$\displaystyle\begin{array}{rcl} & \Phi (t) =\int _{ 0}^{t}\overrightarrow{\phi }(s) \cdot dB_{s},\,\,\Psi (t) =\int _{ 0}^{t}\overrightarrow{\psi }(s) \cdot dB_{s}, & {}\\ & X(t) =\int _{ 0}^{t}\overrightarrow{x}(s) \cdot dB_{s},\,\,Y (t) =\int _{ 0}^{t}\overrightarrow{y}(s) \cdot dB_{s},& {}\\ & U(t) =\int _{ 0}^{t}\overrightarrow{u}(s) \cdot dB_{s},\,\,V (t) =\int _{ 0}^{t}\overrightarrow{v}(s) \cdot dB_{s},& {}\\ \end{array}$$

and that these vector processes and their components satisfy Lemmas 3 and 4, namely:

$$\displaystyle\begin{array}{rcl} u_{1}v_{1} + u_{2}v_{2} = 0,& &{}\end{array}$$
(19)
$$\displaystyle\begin{array}{rcl} (u_{1})^{2} + (u_{ 2})^{2} = (v_{ 1})^{2} + (v_{ 2})^{2},& &{}\end{array}$$
(20)
$$\displaystyle{\mathfrak{I}\mathbf{E\,}(F \cdot Z) =\int _{ 0}^{t}(d\langle \Phi,X\rangle + d\langle \Psi,Y \rangle )\,ds =\int _{ 0}^{t}(\phi _{ 1}x_{1} +\phi _{2}x_{2} +\psi _{1}y_{1} +\psi _{2}y_{2})\,ds.}$$

Hence,

$$\displaystyle{ \vert \mathfrak{I}\mathbf{E\,}(Z\cdot F)\vert \leq \int _{0}^{t}((\phi _{ 1})^{2}+(\phi _{ 2})^{2}+(\psi _{ 1})^{2}+(\psi _{ 2})^{2})^{1/2}((x_{ 1})^{2}+(x_{ 2})^{2}+(y_{ 1})^{2}+(y_{ 2})^{2})^{1/2}\,ds. }$$
(21)

By subordination assumption (17) we have

$$\displaystyle{ \vert \mathfrak{I}\mathbf{E\,}(Z\cdot F)\vert \leq \int _{0}^{t}((u_{ 1})^{2}+(u_{ 2})^{2}+(v_{ 1})^{2}+(v_{ 2})^{2})^{1/2}((\phi _{ 1})^{2}+(\phi _{ 2})^{2}+(\psi _{ 1})^{2}+(\psi _{ 2})^{2})^{1/2}\,ds. }$$
(22)

Our next goal is to prove that

$$\displaystyle\begin{array}{rcl} & \sqrt{\frac{3} {2}}\int _{0}^{t}((u_{1})^{2} + (u_{2})^{2} + (v_{1})^{2} + (v_{2})^{2})^{1/2}((\phi _{1})^{2} + (\phi _{2})^{2} + (\psi _{1})^{2} + (\psi _{2})^{2})^{1/2}\,ds \leq & \\ & 2\bigg(\frac{\|W\|_{3/2}^{3/2}} {3/2} + \frac{\|F\|_{3}^{3}} {3} \bigg). &{}\end{array}$$
(23)

Let us polarize the last inequality to convert its RHS to $2\|W\|_{3/2}\|F\|_{3}$. Then let us use the combination of (22) and (23). Then we obtain the desired estimate

$$\displaystyle{ \|Z\|_{3/2} \leq \frac{2\sqrt{2}} {\sqrt{3}} \|W\|_{3/2}, }$$
(24)

which is equivalent to the claim of Theorem 7 for q = 3∕2.

We are left to prove (23). For that we will need next several sections. □ 

5 Bellman Functions and Martingales, the Proof of (23)

Suppose we have a function of four real variables such that

$$\displaystyle\begin{array}{rcl} 0 \leq B(y_{11},y_{12},y_{21},y_{22}) \leq \frac{2} {3}(y_{11}^{2} + y_{12}^{2})^{3/2} + \frac{4} {3}(y_{21}^{2} + y_{22}^{2})^{3/4},& &{}\end{array}$$
(25)
$$\displaystyle\begin{array}{rcl} \langle d^{2}B(y_{ 11},y_{12},y_{21},y_{22})\left [\begin{array}{*{10}c} dy_{11} \\ dy_{12} \\ dy_{21} \\ dy_{22} \end{array} \right ],\left [\begin{array}{*{10}c} dy_{11} \\ dy_{12} \\ dy_{21} \\ dy_{22} \end{array} \right ]\rangle \geq & &{}\end{array}$$
(26)
$$\displaystyle\begin{array}{rcl} & & \tau (dy_{11}^{2} + dy_{ 12}^{2}) + \frac{1} {\tau } (dy_{21}^{2} + dy_{ 22}^{2}) + \frac{3\tau } {4x_{2}}\Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & & + \frac{\tau x_{1}} {\sqrt{x_{1 }^{2 } + 3x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2}, {}\\ \end{array}$$

where $x_{1} = (y_{11}^{2} + y_{12}^{2})^{1/2}$, $x_{2} = (y_{21}^{2} + y_{22}^{2})^{1/2}$, and $\tau > 0$ satisfies

$$\displaystyle{ \frac{3} {4} \frac{\tau } {(y_{21}^{2} + y_{22}^{2})^{1/2}} + \frac{2} {\tau } \geq \frac{3} {\tau }. }$$
(27)

Then we can prove (23). Let us start by writing Itô’s formula for the process \(b(t):= B(\Phi (t),\Psi (t),U(t),V (t))\):

$$\displaystyle{db =\langle \nabla B(\Phi,\ldots,V ),(d\Phi (t),\ldots,dV (t))\rangle + \frac{1} {2}(d^{2}B(\phi _{1},\psi _{1},u_{1},v_{1}) + d^{2}B(\phi _{2},\psi _{2},u_{2},v_{2})).}$$

Here d 2 B stands for the Hessian bilinear form. It is applied to vector (ϕ 1, ψ 1, u 1, v 1) and then to vector (ϕ 2, ψ 2, u 2, v 2). Of course, the second derivatives of B constituting this form are calculated at point \((\Phi,\Psi,U,V )\). All this is at time t. The first term is a martingale with zero average, and it disappears after taking the expectation.

Therefore,

$$\displaystyle\begin{array}{rcl} & \mathbf{E\,}(b(t) - b(0)) = \mathbf{E\,}\int _{0}^{t}db(s)\,ds = & \\ & \frac{1} {2}\int _{0}^{t}((d^{2}B(\phi _{ 1},\psi _{1},u_{1},v_{1}) + d^{2}B(\phi _{ 2},\psi _{2},u_{2},v_{2}))\,ds =: \frac{1} {2}\int _{0}^{t}dI.&{}\end{array}$$
(28)

The sum in (28) is the Hessian bilinear form on vector (ϕ 1, ψ 1, u 1, v 1) plus the Hessian bilinear form on vector (ϕ 2, ψ 2, u 2, v 2). Using (26) we can add these two forms with a definite cancellation:

$$\displaystyle\begin{array}{rcl} & & dI =\tau ((\phi _{1})^{2} + (\psi _{ 1})^{2}) + 1/\tau ((u_{ 1})^{2} + (v_{ 1})^{2}) + {}\\ & & \frac{3} {4} \frac{\tau } {(U^{2} + V ^{2})^{1/2}} \frac{V ^{2}(u_{1})^{2} + U^{2}(v_{1})^{2} - 2UV u_{1}v_{1}} {U^{2} + V ^{2}} +\, \text{Positive term} + {}\\ & & \tau ((\phi _{2})^{2} + (\psi _{ 2})^{2}) + 1/\tau ((u_{ 2})^{2} + (v_{ 2})^{2}) + {}\\ & & \frac{3} {4} \frac{\tau } {(U^{2} + V ^{2})^{1/2}} \frac{V ^{2}(u_{2})^{2} + U^{2}(v_{2})^{2} - 2UV u_{2}v_{2}} {U^{2} + V ^{2}} +\, \text{Positive term}. {}\\ \end{array}$$

Notice that orthogonality (19) and equality of norms (20):

$$\displaystyle\begin{array}{rcl} d\langle U,V \rangle = 0,& &{}\end{array}$$
(29)
$$\displaystyle\begin{array}{rcl} d\langle U,U\rangle = d\langle V,V \rangle,& &{}\end{array}$$
(30)

imply the pointwise equalities $u_{1}v_{1} + u_{2}v_{2} = 0$ and $(u_{1})^{2} + (u_{2})^{2} = (v_{1})^{2} + (v_{2})^{2}$, and thus

$$\displaystyle{V ^{2}(u_{ 1})^{2}+U^{2}(v_{ 1})^{2}+V ^{2}(u_{ 2})^{2}+U^{2}(v_{ 2})^{2} = \frac{1} {2}(U^{2}+V ^{2})((u_{ 1})^{2}+(u_{ 2})^{2}+(v_{ 1})^{2}+(v_{ 2})^{2}).}$$
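This identity is easy to test numerically: in two dimensions, the pointwise constraints force $(v_{1},v_{2})$ to be a rotation of $(u_{1},u_{2})$ by ±90°, and the following sketch of ours (for illustration only) checks the displayed equality on random data.

```python
import random

def identity_gap(u1, u2, U, V):
    # take v as the 90-degree rotation of u: then u . v = 0 and |v| = |u|,
    # which is how the pointwise constraints are realized in 2D
    v1, v2 = -u2, u1
    lhs = V*V*u1*u1 + U*U*v1*v1 + V*V*u2*u2 + U*U*v2*v2
    rhs = 0.5 * (U*U + V*V) * (u1*u1 + u2*u2 + v1*v1 + v2*v2)
    return lhs - rhs

random.seed(1)
for _ in range(1000):
    args = [random.uniform(-3, 3) for _ in range(4)]
    assert abs(identity_gap(*args)) < 1e-9
```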

Therefore, the UV-term above disappears, and we get

$$\displaystyle\begin{array}{rcl} & & dI =\tau ((\phi _{1})^{2} + (\psi _{ 1})^{2} + (\phi _{ 2})^{2} + (\psi _{ 2})^{2}) + 1/\tau ((u_{ 1})^{2} + (v_{ 1})^{2} + (u_{ 2})^{2} + (v_{ 2})^{2})) + {}\\ & & \frac{3} {4} \frac{\tau } {(U^{2} + V ^{2})^{3/2}} \cdot \frac{1} {2}(U^{2} + V ^{2})((u_{ 1})^{2} + (u_{ 2})^{2} + (v_{ 1})^{2} + (v_{ 2})^{2}) +\, \text{Positive} = {}\\ & & \tau ((\phi _{1})^{2} + (\psi _{ 1})^{2} + (\phi _{ 2})^{2} + (\psi _{ 2})^{2}) + {}\\ & & \frac{1} {2}\bigg(\frac{3} {4} \frac{\tau } {(U^{2} + V ^{2})^{1/2}} + \frac{2} {\tau } \bigg)((u_{1})^{2} + (v_{ 1})^{2} + (u_{ 2})^{2} + (v_{ 2})^{2})) +\, \text{Positive}. {}\\ \end{array}$$

Hence, by using (27) we get

$$\displaystyle\begin{array}{rcl} & dI \geq \tau (\|\overrightarrow{\phi }\|^{2} +\|\overrightarrow{ \psi }\|^{2}) + \frac{3} {2} \cdot \frac{1} {\tau } (\|\overrightarrow{u}\|^{2} +\|\overrightarrow{ v}\|^{2})& \\ & \geq 2\sqrt{\frac{3} {2}}(\|\overrightarrow{\phi }\|^{2} +\|\overrightarrow{ \psi }\|^{2})^{1/2}(\|\overrightarrow{u}\|^{2} +\|\overrightarrow{ v}\|^{2})^{1/2}.&{}\end{array}$$
(31)

Let us combine now (28) and (31). We get

$$\displaystyle{ \sqrt{\frac{3} {2}}\int _{0}^{t}(\|\overrightarrow{\phi }\|^{2} +\|\overrightarrow{ \psi }\|^{2})^{1/2}(\|\overrightarrow{u}\|^{2} +\|\overrightarrow{ v}\|^{2})^{1/2}\,ds \leq \frac{1} {2}\int _{0}^{t}dI\,ds \leq \mathbf{E\,}(b(t)). }$$
(32)

We used (25), which claims that b ≥ 0. But it also claims that

$$\displaystyle{ b(t) = B(\Phi (t),\Psi (t),U(t),V (t)) \leq 2\,\bigg(\frac{\vert (U,V )\vert ^{3/2}} {3/2} + \frac{\vert (\Phi,\Psi )\vert ^{3}} {3} \bigg). }$$
(33)

Combine (32) and (33). We obtain (23).

To find a function satisfying (25) and (26) we need the next several sections.

6 Special Function \(B = \frac{2} {9}\big((y_{11}^{2} + y_{12}^{2}) + 3(y_{21}^{2} + y_{22}^{2})^{1/2}\big)^{3/2} + \frac{2} {9}(y_{11}^{2} + y_{12}^{2})^{3/2}\)

It is useful if the reader thinks that y 11, y 12, y 21, y 22 are correspondingly \(\Phi,\Psi,U,V\).

Also in what follows dy 11, dy 12, dy 21, dy 22 can be viewed as ϕ 1, ψ 1, u 1, v 1 and ϕ 2, ψ 2, u 2, v 2.

Let $\mathbf{B}_{n+m}(x)$ be a real-valued function of n + m variables $x = (x_{1},\ldots,x_{n},x_{n+1},\ldots,x_{n+m})$. Define a function $\mathbf{B}_{nk+m}(y)$ of n vector-valued variables $y_{i} = (y_{i1},\ldots,y_{ik})$, 1 ≤ i ≤ n, and m scalar variables $y_{i}$, n + 1 ≤ i ≤ n + m, as follows:

$$\displaystyle{\mathbf{B}_{nk+m}(y) = \mathbf{B}_{n+m}(x),}$$

where

$$\displaystyle\begin{array}{rcl} x_{i}& =& \|y_{i}\|:=\Big (\sum _{j=1}^{k}y_{ ij}^{2}\Big)^{\frac{1} {2} }\qquad \text{for }i \leq n, {}\\ x_{i}& =& y_{i}\qquad \text{for }i> n. {}\\ \end{array}$$

Omitting indices we shall denote by \(\frac{d^{2}\mathbf{B}} {dx^{2}}\) and \(\frac{d^{2}\mathbf{B}} {dy^{2}}\) the Hessian matrices of $\mathbf{B}_{n+m}(x)$ and $\mathbf{B}_{nk+m}(y)$, respectively.

7 Hessian of a Vector-Valued Function

Lemma 8

Let P j be the following operator from \(\mathbb{R}^{k}\) to \(\mathbb{R}\) :

$$\displaystyle{P_{j}h = \frac{(h,y_{j})} {x_{j}},}$$

i.e., it gives the projection to the direction y j . Let P be the block-diagonal operator from \(\mathbb{R}^{kn+m} = \mathbb{R}^{k} \oplus \mathbb{R}^{k} \oplus \ldots \oplus \mathbb{R}^{k} \oplus \mathbb{R} \oplus \ldots \oplus \mathbb{R}\) to \(\mathbb{R}^{n+m} = \mathbb{R} \oplus \mathbb{R} \oplus \ldots \oplus \mathbb{R} \oplus \mathbb{R} \oplus \ldots \oplus \mathbb{R}\) whose first n diagonal elements are P j and the rest is identity. Then

$$\displaystyle{\frac{d^{2}\mathbf{B}} {dy^{2}} = P^{{\ast}}\frac{d^{2}\mathbf{B}} {dx^{2}} P + \text{diag}\,\left \{(I - P_{i}^{{\ast}}P_{ i}) \frac{1} {x_{i}} \frac{\partial \mathbf{B}} {\partial x_{i}}\right \},}$$

or

$$\displaystyle\begin{array}{rcl} d^{2}\mathbf{B}& =& \sum _{i,j=1}^{n} \frac{\partial ^{2}\mathbf{B}} {\partial x_{i}\partial x_{j}} \cdot \frac{\sum _{s=1}^{k}y_{is}dy_{is}} {x_{i}} \cdot \frac{\sum _{r=1}^{k}y_{jr}dy_{jr}} {x_{j}} {}\\ & +& 2\sum _{i=1}^{n}\sum _{j=n+1}^{n+m} \frac{\partial ^{2}\mathbf{B}} {\partial x_{i}\partial x_{j}} \cdot \frac{\sum _{s=1}^{k}y_{is}dy_{is}} {x_{i}} \cdot dy_{j} {}\\ & +& \sum _{i=n+1}^{n+m}\sum _{j=n+1}^{n+m} \frac{\partial ^{2}\mathbf{B}} {\partial x_{i}\partial x_{j}} \cdot dy_{i} \cdot dy_{j} {}\\ & +& \sum _{i=1}^{n} \frac{1} {x_{i}} \frac{\partial \mathbf{B}} {\partial x_{i}} \cdot \left (\sum _{j=1}^{k}dy_{ij}^{2} -\Big (\frac{\sum _{j=1}^{k}y_{ij}dy_{ij}} {x_{i}} \Big)^{2}\right ). {}\\ \end{array}$$
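As a sanity check of Lemma 8, one can compare this formula with a finite-difference Hessian in the simplest vector case n = 1, k = 2, m = 1. The toy function B₂ below is our own hypothetical choice, used only for the test.

```python
import math

def B2(x1, x2):                      # toy B_{n+m} with n = 1, m = 1 (our choice)
    return x1**3 + x1*x2 + x2**2

def B_vec(y11, y12, y2):             # B_{nk+m} with k = 2: x1 = |(y11, y12)|
    return B2(math.hypot(y11, y12), y2)

y11, y12, y2 = 0.8, -0.6, 1.1
dy11, dy12, dy2 = 0.3, 0.5, -0.2
x1 = math.hypot(y11, y12)

# quadratic form of the Hessian via a 1-D second finite difference
h = 1e-4
f = lambda t: B_vec(y11 + t*dy11, y12 + t*dy12, y2 + t*dy2)
numeric = (f(h) - 2*f(0) + f(-h)) / (h*h)

# Lemma 8 formula for this case: proj is the projection onto the y-direction
proj = (y11*dy11 + y12*dy12) / x1
B11, B12, B22 = 6*x1, 1.0, 2.0       # second derivatives of the toy B2
dB1 = 3*x1*x1 + y2                   # dB2/dx1 at (x1, x2 = y2)
formula = (B11*proj*proj + 2*B12*proj*dy2 + B22*dy2*dy2
           + (dB1/x1) * (dy11*dy11 + dy12*dy12 - proj*proj))
assert abs(numeric - formula) < 1e-5
```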

7.1 Positive Definite Quadratic Forms

Let

$$\displaystyle{ Q = Ax^{2} + 2Bxy + Cy^{2} }$$

be a positive definite quadratic form. We are interested in the best possible constant D such that

$$\displaystyle{ Q \geq 2D\vert x\vert \,\vert y\vert \qquad \text{for all }x,y \in \mathbb{R}. }$$

Dividing this inequality by | x | | y | and putting t = | x | ∕ | y |, we get

$$\displaystyle{At \pm 2B + \frac{C} {t} \geq 2D\qquad \text{for all }t> 0.}$$

The left-hand side has its minimum at the point \(t = \sqrt{\frac{C} {A}}\). Therefore the best D is \(\sqrt{AC} -\vert B\vert\).

Now we would like to present Q as a sum of three squares:

$$\displaystyle{Q = D(\tau x^{2} + \frac{1} {\tau } y^{2}) + (\alpha x +\beta y)^{2},}$$

which would immediately imply the required estimate. We require that

$$\displaystyle{(A - D\tau )x^{2} + 2Bxy + (C -\frac{D} {\tau } )y^{2}}$$

is a perfect square, whence

$$\displaystyle{(A - D\tau )(C -\frac{D} {\tau } ) = B^{2}}$$

or

$$\displaystyle\begin{array}{rcl} & CD\tau ^{2} - (AC - B^{2} + D^{2})\tau + AD = 0,& {}\\ & C\tau ^{2} - 2\sqrt{AC}\tau + A = 0. & {}\\ \end{array}$$

Therefore, \(\tau = \sqrt{\frac{A} {C}}\) and

$$\displaystyle{ Q = (\sqrt{AC} -\vert B\vert )\Big(\sqrt{\frac{A} {C}}x^{2} + \sqrt{\frac{C} {A}}y^{2}\Big) + \vert B\vert \sqrt{\frac{A} {C}}\Big(x +\mathop{ \mathrm{sign}}\nolimits B\sqrt{\frac{C} {A}}y\Big)^{2} }$$
(34)
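A quick numerical check of (34) (an illustration of ours, not part of the argument): for random positive definite coefficients the decomposition reproduces Q exactly and yields the bound Q ≥ 2D| x | | y |.

```python
import math, random

def split_Q(A, B, C, x, y):
    # decomposition (34): Q = D(tau x^2 + y^2/tau) + |B| tau (x + sign(B) y/tau)^2
    D = math.sqrt(A * C) - abs(B)
    tau = math.sqrt(A / C)
    sgn = 1.0 if B >= 0 else -1.0
    Q = A*x*x + 2*B*x*y + C*y*y
    decomp = D * (tau*x*x + y*y/tau) + abs(B) * tau * (x + sgn*y/tau)**2
    return Q, decomp, D

random.seed(2)
for _ in range(1000):
    A = random.uniform(0.5, 3.0)
    C = random.uniform(0.5, 3.0)
    B = random.uniform(-0.99, 0.99) * math.sqrt(A*C)   # keeps Q positive definite
    x, y = random.uniform(-4, 4), random.uniform(-4, 4)
    Q, decomp, D = split_Q(A, B, C, x, y)
    assert abs(Q - decomp) < 1e-9            # (34) is an identity
    assert Q >= 2*D*abs(x)*abs(y) - 1e-9     # the best-constant bound
```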

7.2 Example

Let

$$\displaystyle{ \mathbf{B}_{2}(x) = \frac{2} {9}(x_{1}^{2} + 3x_{ 2})^{3/2} + \frac{2} {9}x_{1}^{3}, }$$
(35)
$$\displaystyle{\mathbf{B}_{4}(y) = B_{2}(x);\qquad x_{i} = \sqrt{y_{i1 }^{2 } + y_{i2 }^{2}}.}$$

Calculate the derivatives:

$$\displaystyle\begin{array}{rcl} & \frac{\partial \mathbf{B}_{2}} {\partial x_{1}} = \frac{2} {3}x_{1}(\sqrt{x_{1 }^{2 } + 3x_{2}} + x_{1}), \frac{\partial \mathbf{B}_{2}} {\partial x_{2}} = \sqrt{x_{1 }^{2 } + 3x_{2}},& {}\\ & A = \frac{\partial ^{2}\mathbf{B}_{ 2}} {\partial x_{1}^{2}} = \frac{2(\sqrt{x_{1 }^{2 }+3x_{2}}+x_{1})^{2}} {3\sqrt{x_{1 }^{2 }+3x_{2}}}, & {}\\ & B = \frac{\partial ^{2}\mathbf{B}_{ 2}} {\partial x_{1}\partial x_{2}} = \frac{x_{1}} {\sqrt{x_{1 }^{2 }+3x_{2}}},C = \frac{\partial ^{2}\mathbf{B}_{ 2}} {\partial x_{2}^{2}} = \frac{3} {2\sqrt{x_{1 }^{2 }+3x_{2}}}, & {}\\ & D = \sqrt{AC} -\vert B\vert = 1, & {}\\ \end{array}$$

Also

$$\displaystyle\begin{array}{rcl} \tau = \sqrt{\frac{A} {C}} = \frac{2} {3}(\sqrt{x_{1 }^{2 } + 3x_{2}} + x_{1}),& &{}\end{array}$$
(36)
$$\displaystyle\begin{array}{rcl} \frac{1} {\tau } = \frac{\sqrt{x_{1 }^{2 } + 3x_{2}} - x_{1}} {2x_{2}}.& &{}\end{array}$$
(37)

After substitution in the expressions of the preceding sections we get

$$\displaystyle\begin{array}{rcl} d^{2}\mathbf{B}_{4}& =& \tau \Big(\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} \Big)^{2} + \frac{1} {\tau } \Big(\frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big)^{2} {}\\ & +& \frac{\tau x_{1}} {\sqrt{x_{1 }^{2 } + 3x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2} {}\\ & +& \frac{2} {3}\Big(\sqrt{x_{1 }^{2 } + 3x_{2}} + x_{1}\Big)\Big(\frac{y_{12}dy_{11} - y_{11}dy_{12}} {x_{1}} \Big)^{2} {}\\ & +& \frac{\sqrt{x_{1 }^{2 } + 3x_{2}}} {x_{2}} \Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & =& \tau (dy_{11}^{2} + dy_{ 12}^{2}) + \frac{1} {\tau } (dy_{21}^{2} + dy_{ 22}^{2}) + \frac{3\tau } {4x_{2}}\Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & +& \frac{\tau x_{1}} {\sqrt{x_{1 }^{2 } + 3x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2}. {}\\ \end{array}$$

7.3 Verifying (27)

Here, using (36), (37) we get

$$\displaystyle{ \tau = \frac{2} {3}\Big(\big((y_{11}^{2} + y_{12}^{2}) + 3(y_{21}^{2} + y_{22}^{2})^{1/2}\big)^{1/2} + (y_{11}^{2} + y_{12}^{2})^{1/2}\Big). }$$
(38)

And hence

$$\displaystyle{ \frac{1} {\tau } = \frac{((y_{11}^{2} + y_{12}^{2}) + 3(y_{21}^{2} + y_{22}^{2})^{1/2})^{1/2} - (y_{11}^{2} + y_{12}^{2})^{1/2}} {2(y_{21}^{2} + y_{22}^{2})^{1/2}}. }$$
(39)

Let us now (when we know τ) check the condition (27):

$$\displaystyle\begin{array}{rcl} & \frac{3} {4} \frac{\tau } {(y_{21}^{2}+y_{22}^{2})^{1/2}} + \frac{2} {\tau } = \frac{1} {2} \frac{\big((y_{11}^{2}+y_{12}^{2})+3(y_{21}^{2}+y_{22}^{2})^{1/2}\big)^{1/2}+(y_{11}^{2}+y_{12}^{2})^{1/2}} {(y_{21}^{2}+y_{22}^{2})^{1/2}} +& {}\\ & \frac{\big((y_{11}^{2}+y_{12}^{2})+3(y_{21}^{2}+y_{22}^{2})^{1/2}\big)^{1/2}-(y_{11}^{2}+y_{12}^{2})^{1/2}} {(y_{21}^{2}+y_{22}^{2})^{1/2}} = & {}\\ & \frac{3\big((y_{11}^{2}+y_{12}^{2})+3(y_{21}^{2}+y_{22}^{2})^{1/2}\big)^{1/2}-(y_{11}^{2}+y_{12}^{2})^{1/2}} {2(y_{21}^{2}+y_{22}^{2})^{1/2}} \geq \frac{3} {\tau }. & {}\\ \end{array}$$
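In fact, subtracting 3∕τ from the left-hand side leaves exactly $(y_{11}^{2}+y_{12}^{2})^{1/2}/(y_{21}^{2}+y_{22}^{2})^{1/2} \geq 0$; the following sketch of ours verifies this on random data.

```python
import math, random

def check_27(y11, y12, y21, y22):
    x1 = math.hypot(y11, y12)
    x2 = math.hypot(y21, y22)
    r = math.sqrt(x1*x1 + 3*x2)
    tau = (2/3) * (r + x1)             # formula (38)
    lhs = 0.75 * tau / x2 + 2.0 / tau  # left-hand side of (27)
    # algebraically lhs - 3/tau = x1/x2 >= 0
    return lhs - 3.0 / tau

random.seed(3)
for _ in range(1000):
    y = [random.uniform(-3, 3) for _ in range(4)]
    x1 = math.hypot(y[0], y[1])
    x2 = math.hypot(y[2], y[3])
    if x2 < 1e-3:
        continue                       # stay away from the singular set x2 = 0
    gap = check_27(*y)
    assert gap >= -1e-12               # condition (27) holds
    assert abs(gap - x1/x2) < 1e-8     # and the gap is exactly x1/x2
```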

So, yes, we have finished the proof that the function

$$\displaystyle{B = \frac{2} {9}\big((y_{11}^{2} + y_{12}^{2}) + 3(y_{21}^{2} + y_{22}^{2})^{1/2}\big)^{3/2} + \frac{2} {9}(y_{11}^{2} + y_{12}^{2})^{3/2}}$$

satisfies all the differential properties we wished, and thus it proves our main result for \(q = \frac{3} {2}\). In fact, we saw that it proves (23), and (23) in its turn implies (24), which is the same as Theorem 7 for q = 3∕2.

We are very lucky that B can be found in explicit form. There are only two such exponents for which an explicit form exists: \(q = \frac{3} {2}\) and q = 2.

8 Explanation of How We Found This Special Function B: Pogorelov’s Theorem

We owe the reader an explanation of where we got this function B, which played such a prominent part above.

Let p ≥ 2. We want to find a function satisfying the following properties:

  • 1) B is defined in the whole plane \(\mathbb{R}^{2}\) and B(u, v) = B(−u, v) = B(u, −v);

  • 2) \(0 \leq B(u,v) \leq (\,p - 1)(\frac{1} {p}\vert u\vert ^{p} + \frac{1} {q}\vert v\vert ^{q})\);

  • 3) Everywhere we have inequality for Hessian quadratic form d 2 B(u, v) ≥ 2 | du | | dv | ;

  • 4) Homogeneity: B(c 1∕p u, c 1∕q v) = cB(u, v), c > 0;

  • 5) Function B should be the “best” one satisfying 1), 2), 3).

We understand the last statement as follows: B must saturate inequalities to make them equalities on a natural subset of \(\mathbb{R}^{2}\) in 2) and on a natural subset of the tangent bundle of \(\mathbb{R}^{2}\) in 3).

Let us start with 3). This inequality just means that d 2 B(u, v) ≥ 2dudv, d 2 B(u, v) ≥ −2dudv for any \((u,v) \in \mathbb{R}^{2}\) and for any \((du,dv) \in \mathbb{R}^{2}\). In other words, this is just positivity of matrices

$$\displaystyle{ \left [\begin{array}{*{10}c} B_{uu}, &B_{uv} - 1 \\ B_{vu} - 1,& B_{vv} \end{array} \right ] \geq 0,\,\,\left [\begin{array}{*{10}c} B_{uu}, &B_{uv} + 1 \\ B_{vu} + 1,& B_{vv} \end{array} \right ] \geq 0. }$$
(40)

Now we want (40) to barely occur. In other words, we want one of the matrices in (40) to have a zero determinant for every (u, v).

Notice that symmetry 1) allows us to consider B only in the first quadrant. Here we will assume the second matrix in (40) to have zero determinant in the first quadrant.

So let us assume for u > 0, v > 0

$$\displaystyle{ 5)\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\det \left [\begin{array}{*{10}c} B_{uu}, &B_{uv} + 1 \\ B_{vu} + 1,& B_{vv} \end{array} \right ] = 0. }$$
(41)

Let us introduce

$$\displaystyle{A(u,v):= B(u,v) + uv.}$$

So, we require that

$$\displaystyle{ \det \left [\begin{array}{*{10}c} A_{uu},&A_{uv} \\ A_{vu},&A_{vv} \end{array} \right ] = 0. }$$
(42)

Returning to saturation of 2): we require that \(B(u,v) =\phi (u,v):= (\,p - 1)(\frac{1} {p}u^{p} + \frac{1} {q}v^{q})\) at a non-zero point. By homogeneity 4) we have this equality on the whole curve \(\Gamma\), where \(\Gamma\) is invariant under transformations u → c 1∕p u, v → c 1∕q v.

$$\displaystyle{ B(u,v) =\phi (u,v):= (\,p - 1)(\frac{1} {p}u^{p} + \frac{1} {q}v^{q})\,\,\,\text{on the curve}\,\,\,v^{q} =\gamma u^{p}. }$$
(43)

Notice that γ is unknown at this moment. We are going to solve (42) and (43), so that our solution satisfies (40), 1), 2), 3), 4).

Remark

We strongly suspect that such a solution is still non-unique. On the other hand, one cannot “improve” 1), 2), 3), 4) by, say, changing the 2 in 3) to a bigger constant, or making the constant p − 1 in 2) smaller.

Recall that we also have the symmetry conditions on A(u, v) + uv =: B(u, v). They are

$$\displaystyle{B(-u,v) = B(u,v),\,\,B(u,-v) = B(u,v).}$$

We assume the smoothness of B. This is a somewhat ad hoc assumption, and we will use it as such: namely, we will assume it when it is convenient, and we will be on guard not to come to a contradiction. Anyway, assuming now the smoothness of B on the v-axis, we get that the symmetry implies the Neumann boundary condition on B on the v-axis: \(\frac{\partial } {\partial u}B(0,v) = 0\), that is

$$\displaystyle{ \frac{\partial } {\partial u}A(0,v) = v. }$$
(44)

Solving the homogeneous Monge–Ampère equation is the same as building a surface of zero Gaussian curvature. We base the following on a theorem of Pogorelov [31]. The reader can see the algorithm in [36], so we will be brief. The solution A must have the form

$$\displaystyle{ A(u,v) = t_{1} \cdot u + t_{2} \cdot v - t, }$$
(45)

where t 1: = A u (u, v), t 2: = A v (u, v), t(u, v) are unknown functions of u, v, but, say, t 1, t 2 are certain functions of t. Moreover, Pogorelov’s theorem says that

$$\displaystyle{ u \cdot dt_{1} + v \cdot dt_{2} - dt = 0,\,\,\text{meaning}\,\,u \cdot \frac{dt_{1}} {dt} + v \cdot \frac{dt_{2}} {dt} - 1 = 0. }$$
(46)

We write homogeneity condition 4) as follows: A(c 1∕p u, c 1∕q v) = cA(u, v); we differentiate in c and plug in c = 1. Then we obtain

$$\displaystyle{ A(u,v) = \frac{1} {p}t_{1} \cdot u + \frac{1} {q}t_{2} \cdot v, }$$
(47)

which being combined with (45) gives

$$\displaystyle{ \frac{1} {q}t_{1} \cdot u + \frac{1} {p}t_{2} \cdot v - t = 0. }$$
(48)

Notice a simple thing: when t is fixed, (46) gives us the equation of a line in the (u, v) plane. Call this line L t . Functions t 1, t 2 are certain (unknown at this moment) functions of t, so again, for fixed t, equation (48) also gives us a line. Of course this must be L t . Comparing the coefficients we obtain differential equations on t 1, t 2:

$$\displaystyle{ q\frac{dt_{1}} {t_{1}} = \frac{dt} {t},\,p\frac{dt_{2}} {t_{2}} = \frac{dt} {t}. }$$
(49)

We write immediately the solutions in the following form:

$$\displaystyle{ t_{1}(t) = pC_{1}\vert t\vert ^{\frac{1} {q} },\,t_{2}(t) = qC_{2}\vert t\vert ^{\frac{1} {p} }. }$$
(50)

Plugging this into (47) one gets

$$\displaystyle{ A(u,v) = C_{1}t^{\frac{1} {q} }u + C_{2}t^{\frac{1} {p} }v,\,\,B(u,v) = C_{1}t^{\frac{1} {q} }u + C_{2}t^{\frac{1} {p} }v - uv, }$$
(51)

where t(u, v) (see (48)) is defined from the following implicit formula

$$\displaystyle{ t = \frac{p} {q}C_{1}t^{\frac{1} {q} }u + \frac{q} {p}C_{2}t^{\frac{1} {p} }v. }$$
(52)

To determine the unknown constants C 1, C 2 we have only one boundary condition (44). However, we have one more condition. It is a free boundary condition (we think that p ≥ 2 ≥ q)

$$\displaystyle{ B(u,v) =\phi (u,v):= (\,p - 1)(\frac{1} {p}u^{p} + \frac{1} {q}v^{q})\,\,\text{on the curve}\,\,\Gamma:=\{ v^{q} =\gamma ^{q}u^{p}\}. }$$
(53)

This does not seem to help, because we have three unknowns C 1, C 2, γ and two conditions: (44) and (53). But we will require in addition that B(u, v) and ϕ(u, v) have the same tangent plane on the curve \(\Gamma\):

$$\displaystyle{ \frac{B_{u}(u,v)} {B_{v}(u,v)} = \frac{\phi _{u}(u,v)} {\phi _{v}(u,v)}\,\,\text{on the curve}\,\,\Gamma =\{ v^{q} =\gamma ^{q}u^{p}\}. }$$
(54)

Now we are going to solve (44), (53), (54), to find C 1, C 2, γ and plug them into (51) and (52).

First of all

$$\displaystyle{v = A_{u}(0,v) = t_{1}(0,v).}$$

So \(v/pC_{1} = t(0,v)^{\frac{1} {q} }\) from (50). Plug u = 0 into (52) to get \(t(0,v)^{\frac{1} {q} } = \frac{q} {p}C_{2}v\). Combining we get

$$\displaystyle{C_{1}C_{2} = \frac{1} {q}.}$$

Now we use (54).

$$\displaystyle{\frac{t_{1} - v} {t_{2} - u} = \frac{u^{p-1}} {v^{q-1}} = \frac{u^{p}} {v^{q}} \frac{v} {u} = \frac{1} {\gamma ^{q}} \frac{v} {u}.}$$

Using (50) we get

$$\displaystyle{ \frac{pC_{1}t^{\frac{1} {q} } - v} {qC_{2}t^{\frac{1} {p} } - u} = \frac{1} {\gamma ^{q}} \frac{v} {u}. }$$
(55)

Let us write \(\Gamma\) as \(u^{p} = \frac{1} {\gamma } uv\) or v q = γ q−1 uv, and let us write on \(\Gamma\)

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} t^{\frac{1} {q} } = av\quad \\ t^{\frac{1} {p} } = bu\quad \end{array} \right. }$$
(56)

The reader will easily see from what follows that a, b are constants. From (55)

$$\displaystyle{ (\,pC_{1}a - 1)\gamma ^{q}uv = (qC_{ 2}b - 1)uv. }$$
(57)

Also from (56)

$$\displaystyle{ \frac{a^{q}} {b^{p}} = \frac{1} {\gamma ^{q}}, }$$
(58)

and from (56) and (52)

$$\displaystyle{ ab = \frac{p} {q}C_{1}a + \frac{q} {p}C_{2}b. }$$
(59)

From (56), (53) it follows

$$\displaystyle{ C_{1}a + C_{2}b - 1 = (\,p - 1)(\frac{1} {p} \cdot \frac{1} {\gamma } + \frac{1} {q} \cdot \gamma ^{q-1}). }$$
(60)

We already proved

$$\displaystyle{ C_{1}C_{2} = \frac{1} {q}. }$$
(61)

We have five equations (57)–(61) on five unknowns C 1, C 2, a, b, γ.

One solution is obvious:

$$\displaystyle{\gamma = 1,a = qC_{2},b = pC_{1},p^{p}C_{ 1}^{p} = q^{q}C_{ 2}^{q},}$$

from where one finds

$$\displaystyle{ C_{1} = \frac{1} {p}p^{\frac{1} {p} },\,C_{2} = \frac{1} {q}p^{\frac{1} {q} }. }$$
(62)

Therefore,

$$\displaystyle{ B(u,v) = \frac{1} {p}p^{\frac{1} {p} }t^{\frac{1} {q} }u + \frac{1} {q}p^{\frac{1} {q} }t^{\frac{1} {p} }v - uv, }$$
(63)

where t is defined from

$$\displaystyle{ t = \frac{1} {q}p^{\frac{1} {p} }t^{\frac{1} {q} }u + \frac{1} {p}p^{\frac{1} {q} }t^{\frac{1} {p} }v. }$$
(64)
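Equation (64) defines t only implicitly. As a quick numerical sketch (ours, not from the paper; the function name `solve_t` is our own), note that dividing (64) by t makes the right-hand side strictly decreasing in t, so the positive root is unique and can be found by bisection on a logarithmic scale:

```python
import math

def solve_t(u, v, p):
    """Positive root of (64): t = (p^(1/p)/q) t^(1/q) u + (p^(1/q)/p) t^(1/p) v.

    Dividing by t, g(t) = (p^(1/p)/q) t^(-1/p) u + (p^(1/q)/p) t^(-1/q) v
    decreases strictly from +infinity to 0, so g(t) = 1 has a unique root.
    """
    q = p / (p - 1.0)
    g = lambda t: (p**(1/p) / q) * t**(-1/p) * u + (p**(1/q) / p) * t**(-1/q) * v
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)          # bisect in log t
        lo, hi = (mid, hi) if g(mid) > 1.0 else (lo, mid)
    return math.sqrt(lo * hi)

# At u = v = 1 one can check by hand that t = p solves (64) exactly,
# since p^(1/p) * p^(1/q) = p and 1/p + 1/q = 1.
```

By homogeneity 4), t itself scales as t(c 1∕p u, c 1∕q v) = c t(u, v), which gives a convenient consistency test.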

If we specify \(p = 3,q = \frac{3} {2}\) we get

$$\displaystyle\begin{array}{rcl} C_{1} = \frac{1} {3}3^{\frac{1} {3} },\,C_{2} = \frac{2} {3}3^{\frac{2} {3} }.& &{}\end{array}$$
(65)
$$\displaystyle\begin{array}{rcl} t^{\frac{2} {3} } = \frac{2} {3}3^{\frac{1} {3} }t^{\frac{1} {3} }u + \frac{1} {3}3^{\frac{2} {3} }v,& &{}\end{array}$$
(66)

and solving the quadratic equation on \(s:= t^{\frac{1} {3} }\): \(s^{2} - 2C_{1}us -\frac{C_{2}} {2} v = 0\), we get (the right root will be with + sign)

$$\displaystyle{ t^{\frac{1} {3} }(u,v) = s = C_{1}u + \sqrt{C_{1 }^{2 }u^{2 } + \frac{C_{2 } } {2} v}. }$$
(67)

Therefore, B(u, v) being equal to C 1 s 2 u + C 2 svuv is (\(C_{1}C_{2} = \frac{2} {3}\), see (61))

$$\displaystyle{B(u,v) = C_{1}u(2C_{1}us+\frac{C_{2}} {2} v)+C_{2}vs-uv = (2C_{1}^{2}u^{2}+C_{ 2}v)s+\frac{1} {2}C_{1}C_{2}uv-uv,}$$

and so

$$\displaystyle\begin{array}{rcl} & B(u,v) = (2C_{1}^{2}u^{2} + C_{2}v)(C_{1}u + \sqrt{C_{1 }^{2 }u^{2 } + \frac{C_{2 } } {2} v}) -\frac{2} {3}uv, & {}\\ & = (2C_{1}^{2}u^{2} + C_{2}v)\sqrt{C_{1 }^{2 }u^{2 } + \frac{C_{2 } } {2} v} + 2C_{1}^{3}u^{3} + (C_{ 1}C_{2} -\frac{2} {3})uv.& {}\\ \end{array}$$

The last term disappears (see (61)), and we get

$$\displaystyle{B(u,v) = 2(C_{1}^{2}u^{2}+\frac{C_{2}} {2} v)\sqrt{C_{1 }^{2 }u^{2 } + \frac{C_{2 } } {2} v}+2C_{1}^{3}u^{3} = 2C_{ 1}^{3}(u^{2}+ \frac{C_{2}} {2C_{1}^{2}}v)^{\frac{3} {2} }+2C_{1}^{3}u^{3}.}$$

Finally from (65)

$$\displaystyle{ B(u,v) = \frac{2} {9}((u^{2} + 3v)^{\frac{3} {2} } + u^{3}). }$$
(68)

This is exactly the function in (35). This function gave us our main theorem for p = 3. We have just explained how we got it.
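As a sanity check (ours), (68) can be compared numerically with the implicit description (63), (67), and one can confirm that the second matrix in (40) is indeed degenerate, i.e. \(B_{uu}B_{vv} = (B_{uv}+1)^{2}\), using finite differences:

```python
import math

C1 = 3**(1/3) / 3            # constants (65) for p = 3, q = 3/2
C2 = (2/3) * 3**(2/3)

def B_closed(u, v):
    """Formula (68)."""
    return (2/9) * ((u*u + 3*v)**1.5 + u**3)

def B_implicit(u, v):
    """Formula (63) with s = t^(1/3) taken from the explicit root (67)."""
    s = C1*u + math.sqrt(C1**2 * u**2 + 0.5*C2*v)
    return C1*s**2*u + C2*s*v - u*v

def hessian(f, u, v, h=1e-4):
    """Central finite differences for (f_uu, f_uv, f_vv)."""
    fuu = (f(u+h, v) - 2*f(u, v) + f(u-h, v)) / h**2
    fvv = (f(u, v+h) - 2*f(u, v) + f(u, v-h)) / h**2
    fuv = (f(u+h, v+h) - f(u+h, v-h) - f(u-h, v+h) + f(u-h, v-h)) / (4*h**2)
    return fuu, fuv, fvv

Buu, Buv, Bvv = hessian(B_closed, 1.3, 0.8)
```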

By the way, in this particular case the transcendental equation on γ becomes the usual cubic equation on \(\sqrt{\gamma }\): \(2\sqrt{\gamma } + 1 = 4 -\frac{1} {\gamma }\), whose only admissible solution is γ = 1.
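To see where the cubic comes from (a check of ours): substituting \(s=\sqrt{\gamma}\) and clearing denominators in \(2\sqrt{\gamma } + 1 = 4 -\frac{1}{\gamma }\) gives \(2s^{3} - 3s^{2} + 1 = 0\), which factors as \((s-1)^{2}(2s+1)\); the only root with \(s = \sqrt{\gamma }\geq 0\) is s = 1, i.e. γ = 1:

```python
# The transcendental equation for gamma, written in s = sqrt(gamma),
# clears to the cubic 2s^3 - 3s^2 + 1 = 0 (our restatement).
cubic = lambda s: 2*s**3 - 3*s**2 + 1
factored = lambda s: (s - 1.0)**2 * (2.0*s + 1.0)

samples = [-1.0, -0.5, 0.0, 0.3, 1.0, 2.5]
max_gap = max(abs(cubic(s) - factored(s)) for s in samples)
```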

9 Explanation: Pogorelov’s Theorem Again

We owe the reader an explanation of why we chose the function A(u, v) = B(u, v) + uv rather than A(u, v) = B(u, v) − uv to have the degenerate Hessian form.

We want to find a function satisfying the following properties (in what follows p ≥ 2):

  • 1) B is defined in the whole plane \(\mathbb{R}^{2}\) and B(u, v) = B(−u, v) = B(u, −v);

  • 2) \(0 \leq B(u,v) \leq \phi (u,v) = (\,p - 1)(\frac{1} {p}\vert u\vert ^{p} + \frac{1} {q}\vert v\vert ^{q})\);

  • 3) Everywhere we have inequality for Hessian quadratic form d 2 B(u, v) ≥ 2 | du | | dv | ;

  • 4) Homogeneity: B(c 1∕p u, c 1∕q v) = cB(u, v), c > 0;

  • 5) Function B should be the “best” one satisfying 1), 2), 3).

  1. (i)

    What do we mean by best function? We would like B to be the ‘largest’ function below ϕ(u, v) such that the convexity condition in 3) holds. We expect that such a function should equal the upper bound ϕ(u, v) at some point(s) and the inequality in 3) should be equality where possible.

  2. (ii)

    Due to the symmetry in 1), we can restrict our attention to {u > 0, v > 0}.

  3. (iii)

    If we have at some (u, v), B(u, v) = ϕ(u, v), then condition 4) implies that B(c 1∕p u, c 1∕q v) = cB(u, v) = c ϕ(u, v) = ϕ(c 1∕p u, c 1∕q v). Hence they remain equal on a curve {(u, v): v q = γ q u p} for some γ.

  4. (iv)

    The condition \(\langle d^{2}B\cdot (u,v),(u,v)\rangle \geq 2\vert u\vert \vert v\vert\) means that the ‘directional convexity’ in direction (u, v) stays above the value 2 | u | | v | . This means that the directional convexity of B is above that of both the functions uv and − uv. Equivalently we are asserting the positive semi-definiteness of the matrices:

    $$\displaystyle{ \left (\begin{array}{*{10}c} B_{uu} &B_{uv} - 1 \\ B_{vu} - 1& B_{vv}\end{array} \right ) \geq 0,\left (\begin{array}{*{10}c} B_{uu} &B_{uv} + 1 \\ B_{vu} + 1& B_{vv}\end{array} \right ) \geq 0. }$$
    (69)
  5. (v)

    In order to optimize (69), we require that one of the matrices is degenerate (i.e., has determinant 0). Suppose that the first matrix is degenerate. This means that the function A(u, v) = B(u, v) − uv has a degenerate Hessian. At every point, one of its two non-negative eigenvalues is 0, and the function has 0 convexity in the direction of the corresponding eigenvector. Since the matrix is positive semi-definite, it follows that 0 is the minimal eigenvalue, hence the graph of this function is a surface with Gaussian curvature 0.

    Moreover, the directional convexity of B − uv is greater than that of B + uv in directions of negative slope and less than in directions of positive slope. If we want B + uv to have a non-degenerate positive Hessian, then the degeneracy of B − uv must occur in the positive slope direction.

Let us analyze the function A(u, v) = B(u, v) − uv. A theorem of Pogorelov tells us that A will be a linear function on the lines of degeneracy. That is, it will have the form:

$$\displaystyle{ A(u,v) = t_{1}u + t_{2}v - t }$$
(70)

where t 1(u, v), t 2(u, v) and t(u, v) are constant on the lines given by

$$\displaystyle{ \frac{dt_{1}} {dt} u + \frac{dt_{2}} {dt} v - 1 = 0. }$$
(71)

We can say two things about the coefficient functions: the eigen-lines that intersect the positive y axis must have \(\frac{dt_{1}} {dt_{2}} \leq 0\) and \(\frac{dt_{2}} {dt} \geq 0\); this information comes from (71) and the fact that the eigen-lines have positive slope. At the moment we know nothing else about the coefficient functions. We will use the various boundary conditions on B, hence on A, to determine them.

  1. (i)

    First observe that since B(u, v) = B(−u, v) = B(u, −v), we may expect that B is smooth on at least one of the two axes, say on the y axis, and hence the corresponding derivative \(\partial _{u}B(0,v) = 0\). This means:

    $$\displaystyle{ \partial _{u}A(0,v) = -v. }$$
    (72)
  2. (ii)

    We already assumed that

    $$\displaystyle{ B(u,v) =\phi (u,v) = (\,p - 1)(\frac{u^{p}} {p} + \frac{v^{q}} {q} ) }$$
    (73)

    on some curve \(\Gamma =\{ v^{q} =\gamma ^{q}u^{p}\}\).

  3. (iii)

    Let us also assume that the tangent planes of B and ϕ agree on \(\Gamma\). This means that the gradients of the two functions B(u, v) − z and ϕ(u, v) − z should be parallel at the points (u, v, ϕ(u, v)) where \((u,v) \in \Gamma\). Therefore

    $$\displaystyle{(\partial _{u}\phi,\partial _{v}\phi,-1) =\lambda (\partial _{u}B,\partial _{v}B,-1),}$$

    which implies λ = 1 and

    $$\displaystyle{ B_{u}(u,v) = (\,p - 1)u^{p-1},B_{ v}(u,v) = (\,p - 1)v^{q-1} }$$
    (74)

    on the curve \(\Gamma\). Similarly on \(\Gamma\),

    $$\displaystyle{ A_{u}(u,v) = (\,p - 1)u^{p-1} - v,A_{ v}(u,v) = (\,p - 1)v^{q-1} - u. }$$
    (75)

Recall:

$$\displaystyle{ A(u,v) = t_{1}u + t_{2}v - t }$$
(76)

where t 1(u, v) = A u (u, v), t 2(u, v) = A v (u, v) and t(u, v) are constant on the lines given by

$$\displaystyle{ \frac{dt_{1}} {dt} u + \frac{dt_{2}} {dt} v - 1 = 0. }$$
(77)

We also have the homogeneity condition: A(c 1∕p u, c 1∕q v) = cA(u, v). Differentiating this with respect to c and setting c = 1 gives:

$$\displaystyle\begin{array}{rcl} A(u,v)& =& \frac{1} {p}A_{u}(u,v)u + \frac{1} {q}A_{v}(u,v)v{}\end{array}$$
(78)
$$\displaystyle\begin{array}{rcl} & =& \frac{1} {p}t_{1}u + \frac{1} {q}t_{2}v.{}\end{array}$$
(79)

Comparing (76) and (79), we have

$$\displaystyle{ \frac{1} {q}t_{1}u + \frac{1} {p}t_{2}v - t = 0. }$$
(80)

Now comparing (77) and (80) gives

$$\displaystyle{ \frac{dt_{1}} {dt} = \frac{1} {q} \frac{t_{1}} {t}, \frac{dt_{2}} {dt} = \frac{1} {p} \frac{t_{2}} {t}. }$$
(81)

Solving these differential equations, we have

$$\displaystyle{ t_{1}(t) = C_{1}\vert t\vert ^{1/q},t_{ 2}(t) = C_{2}\vert t\vert ^{1/p}. }$$
(82)

Putting this into (80) gives:

$$\displaystyle{ t = \frac{1} {q}C_{1}\vert t\vert ^{1/q}u + \frac{1} {p}C_{2}\vert t\vert ^{1/p}v }$$
(83)

Let us make two observations. Recall that if our eigen-line intercepts the positive y axis and has positive slope, then \(\frac{dt_{1}} {dt_{2}} = \frac{p} {q} \frac{C_{1}} {C_{2}} \vert t\vert ^{\frac{1} {q}-\frac{1} {p} } \leq 0\) and \(\frac{dt_{2}} {dt} \geq 0\). If t > 0, then \(\frac{dt_{2}} {dt} = \frac{1} {p}C_{2}\vert t\vert ^{-1/q}\), and if t < 0, then \(\frac{dt_{2}} {dt} = -\frac{1} {p}C_{2}\vert t\vert ^{-1/q}\). We conclude from this:

  1. (i)

    If t > 0, then C 1 C 2 ≤ 0 and C 2 ≥ 0, hence C 1 ≤ 0,

  2. (ii)

    If t < 0, then C 1 C 2 ≤ 0 and C 2 ≤ 0, hence C 1 ≥ 0.

Let us bring in the following: t 1 = A u (0, v) = −v. The first equality is from Pogorelov and the second is the boundary condition (72). Then (82) implies that

$$\displaystyle{ -v = C_{1}\vert t(0,v)\vert ^{1/q} }$$
(84)

and (83) implies that

$$\displaystyle{ t(0,v) = \frac{1} {p}C_{2}\vert t(0,v)\vert ^{1/p}v. }$$
(85)

Conclude:

  1. (i)

    If v > 0, then C 1 < 0. The previous observations imply t > 0 and C 2 ≥ 0. We are concerned at present with this case of positive y intercept.

  2. (ii)

    From (84) and (85), we conclude

    $$\displaystyle{ C_{1}C_{2} = -p. }$$
    (86)

Next from (75), we know that on \(\Gamma\),

$$\displaystyle{ t_{1} = (\,p-1)u^{p-1} -v = (\frac{p - 1} {\gamma } -1)v,t_{2} = (\,p-1)v^{q-1} -u = ((\,p-1)\gamma ^{q-1} -1)u. }$$
(87)

In terms of t, this says

$$\displaystyle{ C_{1}t^{1/q} = (\frac{p - 1} {\gamma } - 1)v,C_{2}t^{1/p} = ((\,p - 1)\gamma ^{q-1} - 1)u. }$$
(88)

Write on \(\Gamma\)

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} t^{\frac{1} {q} } = aC_{2}v\quad \\ t^{\frac{1} {p} } = bC_{1}u\quad \end{array} \right. }$$
(89)

Note that a ≥ 0 and b ≤ 0 due to the signs of C 1 and C 2. Substituting in (88) and using (86) gives

$$\displaystyle{ a = \frac{1} {p} -\frac{1} {q\gamma },b = \frac{1} {p} -\frac{1} {q}\gamma ^{q-1}. }$$
(90)

Note that (89) also implies that

$$\displaystyle{ \frac{a^{q}C_{2}^{q}} {\vert b\vert ^{p}\vert C_{1}\vert ^{p}} = \frac{1} {\gamma ^{q}} }$$
(91)

Hence (90) and (91) imply

$$\displaystyle{ ( \frac{\gamma } {p} -\frac{1} {q})^{q}C_{ 2}^{q} = (\frac{1} {q}\gamma ^{q-1} -\frac{1} {p})^{p}\vert C_{ 1}\vert ^{p}. }$$
(92)

(92), (86) and the fact pq = p + q imply that

$$\displaystyle{ C_{2} = \left (\frac{p((\,p - 1)\gamma ^{ \frac{1} {p-1} } - 1)^{p-1}} {\gamma -(\,p - 1)} \right )^{\frac{1} {p} }. }$$
(93)

Next observe that (83), (86) and (89) imply

$$\displaystyle{ ab = \frac{1} {q}a + \frac{1} {p}b }$$
(94)

and hence by (90)

$$\displaystyle{ (\frac{1} {p} -\frac{1} {q\gamma })(\frac{1} {p} -\frac{1} {q}\gamma ^{q-1}) = \frac{1} {pq} + \frac{1} {p^{2}} - \frac{1} {q^{2}\gamma } - \frac{1} {pq}\gamma ^{q-1} }$$
(95)

The equation that follows from making substitutions into the boundary condition (73), B = ϕ on \(\Gamma\), together with A = B − uv, gives no new relationship. So we can avoid its consideration.

Simplifying (95) shows that γ is solution to the equation

$$\displaystyle{ \gamma ^{q-1} - (q - 1)\gamma + 2 - q = 0. }$$
(96)

The rest of the analysis is yet to be done. However, note that \(\frac{B_{u}} {u} = \frac{\phi _{u}} {u}\) on \(\Gamma\), and on the corresponding eigen-line we can understand it by using the fact that A u  = B u  − v is constant. This may help later.

10 The Case When p = 3 and \(q = \frac{3} {2}\)

Observe that by setting δ = γ q−1, we can rewrite (96) as

$$\displaystyle{ \delta ^{p-1} - (\,p - 1)\delta + 2 - p = 0. }$$
(97)

Let us analyze the case when p = 3. Then this equation becomes

$$\displaystyle{ \delta ^{2} - 2\delta - 1 = 0 }$$
(98)

whose unique positive solution is \(\delta = 1 + \sqrt{2}\). Therefore

$$\displaystyle{ \gamma = (1 + \sqrt{2})^{2} = 3 + 2\sqrt{2}. }$$
(99)
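A numerical cross-check of the chain (96)–(99) (ours): bisection on (96) with q = 3∕2 recovers \(\gamma = 3 + 2\sqrt{2}\), and \(\delta = 1+\sqrt{2}\) solves (98):

```python
import math

q = 1.5
f = lambda g: g**(q - 1) - (q - 1)*g + 2 - q    # left-hand side of (96)

# f(1) = 1 > 0, f(10) < 0, and f is strictly decreasing for gamma > 1,
# so the root in (1, 10) is unique.
lo, hi = 1.0, 10.0
for _ in range(100):
    mid = 0.5*(lo + hi)
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
gamma = 0.5*(lo + hi)

delta = 1 + math.sqrt(2)      # claimed positive root of (98)
```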

Then using (86), (90) and (93), we obtain

$$\displaystyle{ a = -\frac{5} {3} + \frac{4\sqrt{2}} {3},b = \frac{1} {3} -\frac{2} {3}(3 + 2\sqrt{2})^{1/2} }$$
(100)

and

$$\displaystyle{ C_{1} = \frac{-3^{\frac{2} {3} }(1 + 2\sqrt{2})^{1/3}} {(2\sqrt{3 + 2\sqrt{2}} - 1)^{2/3}},C_{2} = \frac{3^{1/3}(2\sqrt{3 + 2\sqrt{2}} - 1)^{2/3}} {(1 + 2\sqrt{2})^{1/3}} }$$
(101)
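The constants in (101) can be cross-checked numerically (our computation) against the relations (86) and (93); the decimal values also match those used later in this section:

```python
import math

g = 3 + 2*math.sqrt(2)                    # gamma from (99)
rt = 2*math.sqrt(g) - 1                   # = 1 + 2*sqrt(2), since sqrt(g) = 1 + sqrt(2)

C2 = (3 * rt**2 / (g - 2))**(1/3)         # (93) with p = 3, 1/(p-1) = 1/2
C1 = -3 / C2                              # (86): C1*C2 = -p = -3

# the same constants in the form printed in (101)
C1_101 = -3**(2/3) * (1 + 2*math.sqrt(2))**(1/3) / rt**(2/3)
C2_101 = 3**(1/3) * rt**(2/3) / (1 + 2*math.sqrt(2))**(1/3)
```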

Now we will explicitly find B(u, v). Recall

$$\displaystyle\begin{array}{rcl} B(u,v)& =& \frac{1} {p}t_{1}u + \frac{1} {q}t_{2}v + uv {}\\ & =& \frac{C_{1}} {p} t^{1/q}u + \frac{C_{2}} {q} t^{1/p}v + uv {}\\ & =& \frac{C_{1}} {3} t^{2/3}u + \frac{2C_{2}} {3} t^{1/3}v + uv. {}\\ \end{array}$$
$$\displaystyle\begin{array}{rcl} t& =& \frac{1} {q}C_{1}t^{1/q}u + \frac{1} {p}C_{2}t^{1/p}v {}\\ & =& \frac{2} {3}C_{1}t^{2/3}u + \frac{1} {3}C_{2}t^{1/3}v. {}\\ \end{array}$$

Let s = t 1∕3. Then we have \(s^{2} -\frac{2} {3}C_{1}us -\frac{1} {3}C_{2}v = 0\) and

$$\displaystyle{s = \frac{C_{1}} {3} u + \frac{1} {3}\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v}.}$$
$$\displaystyle\begin{array}{rcl} B(u,v)& =& \frac{C_{1}} {3} s^{2}u + \frac{2} {3}C_{2}sv + uv {}\\ & =& \frac{C_{1}} {3} \left (\frac{2C_{1}^{2}} {9} u^{2} + \frac{3C_{2}v} {9} + \frac{2C_{1}u} {9} \sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v}\right )u {}\\ & & +\frac{2} {9}C_{1}C_{2}uv + \frac{2} {9}C_{2}v\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v} + uv. {}\\ \end{array}$$

Use the fact that C 1 C 2 = −3 to simplify and obtain:

$$\displaystyle{ B(u,v) = \frac{2} {27}(C_{1}^{2}u^{2} + 3C_{ 2}v)^{3/2} + \frac{2} {27}C_{1}^{3}u^{3}. }$$
(102)
$$\displaystyle\begin{array}{rcl} B_{u}& =& \frac{2} {9}C_{1}^{3}u^{2} + \frac{1} {9}(C_{1}^{2}u^{2} + 3C_{ 2}v)^{1/2}2C_{ 1}^{2}u \\ B_{v}& =& \frac{C_{2}} {3} (C_{1}^{2}u^{2} + 3C_{ 2}v)^{1/2} \\ B_{uu}& =& \frac{2} {9}C_{1}^{2}\left [\frac{(\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v} + C_{1}u)^{2}} {\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v}} \right ] \\ B_{uv}& =& \frac{\vert C_{1}\vert u} {\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v}} \\ B_{vv}& =& \frac{C_{2}^{2}} {2\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v}} \\ \tau &:=& \sqrt{\frac{B_{uu } } {B_{vv}}} = \frac{2} {3} \frac{\vert C_{1}\vert } {C_{2}} (\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v} + C_{1}u){}\end{array}$$
(103)
$$\displaystyle\begin{array}{rcl} \frac{1} {\tau } & =& \frac{\sqrt{C_{1 }^{2 }u^{2 } + 3C_{2 } v} - C_{1}u} {2\vert C_{1}\vert v} \\ \frac{B_{u}} {u} & =& \frac{2} {9}C_{1}^{3}u + \frac{1} {9}(C_{1}^{2}u^{2} + 3C_{ 2}v)^{1/2}2C_{ 1}^{2} \\ \frac{B_{v}} {v} & =& \frac{C_{2}(C_{1}^{2}u^{2} + 3C_{2}v)^{1/2}} {3v}. {}\end{array}$$
(104)
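A finite-difference check (ours) of (102): the Hessian of A = B − uv is degenerate, i.e. \(B_{uu}B_{vv} = (B_{uv}-1)^{2}\), and the diagonal entries agree with the closed formulas:

```python
import math

C2 = (3 * (1 + 2*math.sqrt(2)))**(1/3)   # simplified value of C2 from (101)
C1 = -3 / C2                              # relation (86)

def B(u, v):
    """Formula (102)."""
    return (2/27) * ((C1**2*u**2 + 3*C2*v)**1.5 + C1**3 * u**3)

u, v, h = 1.1, 0.9, 1e-4
Buu = (B(u+h, v) - 2*B(u, v) + B(u-h, v)) / h**2
Bvv = (B(u, v+h) - 2*B(u, v) + B(u, v-h)) / h**2
Buv = (B(u+h, v+h) - B(u+h, v-h) - B(u-h, v+h) + B(u-h, v-h)) / (4*h**2)

R = math.sqrt(C1**2*u**2 + 3*C2*v)
```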

We can use | C 1 | C 2 = 3 to deduce \(\frac{B_{u}} {u} =\tau\). Next we compute the quadratic form associated with B by using the formulation before:

$$\displaystyle\begin{array}{rcl} & & Q(dx,dy) = B_{uu}dx^{2} + 2B_{ uv}dxdy + B_{vv}dy^{2} {}\\ & =& \left (\sqrt{B_{uu } B_{vv}} -\vert B_{uv}\vert \right )\left (\sqrt{\frac{B_{uu } } {B_{vv}}} dx^{2} + \sqrt{ \frac{B_{vv } } {B_{uu}}}dy^{2}\right ) + \vert B_{ uv}\vert \sqrt{\frac{B_{uu } } {B_{vv}}} \left (dx +\mathop{ \mathrm{sign}}\nolimits (B_{uv})\sqrt{ \frac{B_{vv } } {B_{uu}}}dy\right )^{2} {}\\ & =& \left (\sqrt{B_{uu } B_{vv}} -\vert B_{uv}\vert \right )\left (\tau dx^{2} + \frac{1} {\tau } dy^{2}\right ) + \vert B_{ uv}\vert \tau \left (dx +\mathop{ \mathrm{sign}}\nolimits (B_{uv})\frac{1} {\tau } dy\right )^{2}. {}\\ \end{array}$$

Now let \(\mathbf{B}(y_{11},y_{12},y_{21},y_{22}):= B(\sqrt{y_{11 }^{2 } + y_{12 }^{2}},\sqrt{y_{11 }^{2 } + y_{12 }^{2}}) = B(x_{1},x_{2})\). Then the associated quadratic form becomes

$$\displaystyle\begin{array}{rcl} d^{2}\mathbf{B}& =& \tau \Big(\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} \Big)^{2} + \frac{1} {\tau } \Big(\frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big) {}\\ & +& \frac{\tau \vert C_{1}\vert x_{1}} {\sqrt{C_{1 }^{2 }x_{1 }^{2 } + 3C_{2 } x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2} {}\\ & +& \Big(\frac{B_{u}} {u} =\tau \Big)\Big(\frac{y_{12}dy_{11} - y_{11}dy_{12}} {x_{1}} \Big)^{2} {}\\ & +& \Big(\frac{B_{v}} {v} = \frac{C_{2}(C_{1}^{2}x_{1}^{2} + 3C_{2}x_{2})^{1/2}} {3x_{2}} \Big)\Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & =& \tau (dy_{11}^{2} + dy_{ 12}^{2}) + \frac{1} {\tau } (dy_{21}^{2} + dy_{ 22}^{2}) {}\\ & +& \Big(\frac{C_{2}(C_{1}^{2}x_{1}^{2} + 3C_{2}x_{2})^{1/2}} {3x_{2}} -\frac{1} {\tau } \Big)\Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & +& \frac{\tau \vert C_{1}\vert x_{1}} {\sqrt{C_{1 }^{2 }x_{1 }^{2 } + 3C_{2 } x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2} {}\\ & =& \tau (dy_{11}^{2} + dy_{ 12}^{2}) + \frac{1} {\tau } (dy_{21}^{2} + dy_{ 22}^{2}) {}\\ & +& \Big( \frac{3C_{2}\tau } {4C_{1}^{2}x_{2}}\Big)\Big(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \Big)^{2} {}\\ & +& \frac{\tau \vert C_{1}\vert x_{1}} {\sqrt{C_{1 }^{2 }x_{1 }^{2 } + 3C_{2 } x_{2}}}\Big[\frac{y_{11}dy_{11} + y_{12}dy_{12}} {x_{1}} + \frac{1} {\tau } \frac{y_{21}dy_{21} + y_{22}dy_{22}} {x_{2}} \Big]^{2}. {}\\ \end{array}$$

In order for the quadratic form to have the self-improving property, we need

$$\displaystyle{ \frac{3C_{2}\tau } {4C_{1}^{2}x_{2}} + \frac{2} {\tau } \geq \frac{c} {\tau } }$$
(105)

for a suitable constant c. In fact if \(\frac{C_{2}} {C_{1}^{2}} = 1\), we know that c = 3. This suggests that the right constant is \(2 + \frac{C_{2}} {C_{1}^{2}} \approx 3.276142375\). (Calculation gives | C 1 | ≈ 1.329660319 and C 2 ≈ 2.256215334, hence \(\frac{C_{2}} {C_{1}^{2}} \approx 1.276142375\).)

If the rest of the process is the same as with the previous estimate, then the overall constant estimate would be approximately

$$\displaystyle{ \frac{2\sqrt{2}} {\sqrt{3.276142375}} \approx 1.562656814.}$$

11 The Proof of Theorem 7 for General q ∈ (1, 2]

Recall that we found for 1 < q ≤ 2 ≤ p < ∞, 1∕p + 1∕q = 1, the following function

$$\displaystyle\begin{array}{rcl} B(u,v) = B_{q}(u,v) = \frac{p^{\frac{1} {p} }} {p} t^{\frac{1} {q} }u + \frac{p^{\frac{1} {q} }} {q} t^{\frac{1} {p} }v - uv,\,\,\text{where}& &{}\end{array}$$
(106)
$$\displaystyle\begin{array}{rcl} t = t(u,v)\,\,\text{is the solution of}\,\,t = \frac{p^{\frac{1} {p} }} {q} t^{\frac{1} {q} }u + \frac{p^{\frac{1} {q} }} {p} t^{\frac{1} {p} }v.& &{}\end{array}$$
(107)
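Since B q is given only implicitly, here is a small numerical sketch (ours; the names are our own): solve (107) by bisection (dividing by t makes the right-hand side strictly decreasing in t), then evaluate (106). At u = v = 1 one gets t = p and \(B(1,1) = p - 1 = \phi (1,1)\), so (1, 1) lies on the free boundary; homogeneity 4) also holds:

```python
import math

def t_of(u, v, p):
    """Positive root of (107)."""
    q = p / (p - 1.0)
    g = lambda t: (p**(1/p)/q) * t**(-1/p) * u + (p**(1/q)/p) * t**(-1/q) * v
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo*hi)            # bisect in log t
        lo, hi = (mid, hi) if g(mid) > 1.0 else (lo, mid)
    return math.sqrt(lo*hi)

def Bq(u, v, p):
    """Formula (106)."""
    q = p / (p - 1.0)
    t = t_of(u, v, p)
    return (p**(1/p)/p)*t**(1/q)*u + (p**(1/q)/q)*t**(1/p)*v - u*v
```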

Our goal is to represent the Hessian form of this implicitly given B as a sum of squares. This requires some calculations.

$$\displaystyle\begin{array}{rcl} B_{u} = \frac{p^{\frac{1} {p} }} {p} t^{\frac{1} {q} } - v + \frac{1} {pq}S\frac{t'_{u}} {t},& &{}\end{array}$$
(108)
$$\displaystyle\begin{array}{rcl} B_{v} = \frac{p^{\frac{1} {q} }} {q} t^{\frac{1} {p} } - u + \frac{1} {pq}S\frac{t'_{v}} {t},& &{}\end{array}$$
(109)

where

$$\displaystyle{ S:= p^{\frac{1} {p} }t^{\frac{1} {q} }u + p^{\frac{1} {q} }t^{\frac{1} {p} }v. }$$
(110)

Also

$$\displaystyle{t'_{u} = \frac{p^{\frac{1} {p} }} {q} t^{\frac{1} {q} } \cdot \frac{t} {t -\frac{p^{\frac{1} {p} }} {q^{2}} t^{\frac{1} {q} }u -\frac{p^{\frac{1} {q} }} {p^{2}} t^{\frac{1} {p} }v},}$$

which, after using (107), (110) gives

$$\displaystyle{ \frac{t'_{u}} {t} = p \cdot p^{\frac{1} {p} }\frac{t^{\frac{1} {q} }} {S}. }$$
(111)

Similarly,

$$\displaystyle{ \frac{t'_{v}} {t} = q \cdot p^{\frac{1} {q} }\frac{t^{\frac{1} {p} }} {S}. }$$
(112)

Recall also that we had

$$\displaystyle{ A = A(u,v) = \frac{p^{\frac{1} {p} }} {p} t^{\frac{1} {q} }u + \frac{p^{\frac{1} {q} }} {q} t^{\frac{1} {p} }v. }$$
(113)

Using the notations (110) and (113) we can compute the Hessian of B = B q . Namely,

$$\displaystyle\begin{array}{rcl} & B_{uu} = \frac{2p^{\frac{1} {p} }} {pq} t^{\frac{1} {q} }\frac{t'_{u}} {t} - \frac{1} {pq}A(\frac{t'_{u}} {t} )^{2} + \frac{1} {pq}S\frac{t''_{uu}} {t}, & {}\\ & B_{vv} = \frac{2p^{\frac{1} {q} }} {pq} t^{\frac{1} {p} }\frac{t'_{v}} {t} - \frac{1} {pq}A(\frac{t'_{v}} {t} )^{2} + \frac{1} {pq}S\frac{t''_{vv}} {t},& {}\\ & B_{uv} = \frac{p\,t} {S} - 1 - \frac{1} {pq}A\frac{t'_{u}t'_{v}} {t^{2}} + \frac{1} {pq}S\frac{t''_{uv}} {t}. & {}\\ \end{array}$$

Plugging

$$\displaystyle{\frac{t''_{uu}} {t} = (\frac{1} {q} + 1 -\frac{1} {p})(\frac{t'_{u}} {t} )^{2} - \frac{t} {S}(\frac{t'_{u}} {t} )^{2}}$$

and using (111) we get the following concise formulas:

$$\displaystyle\begin{array}{rcl} B_{uu} = \frac{1} {pq}S(\frac{t'_{u}} {t} )^{2}.& &{}\end{array}$$
(114)
$$\displaystyle\begin{array}{rcl} B_{vv} = \frac{1} {pq}S(\frac{t'_{v}} {t} )^{2}.& &{}\end{array}$$
(115)
$$\displaystyle\begin{array}{rcl} B_{uv} + 1 = \frac{1} {pq}S\frac{t'_{u}t'_{v}} {t^{2}}.& &{}\end{array}$$
(116)
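Formulas (114)–(116) can be tested numerically (our check): compute t from (107) by bisection, form S and \(\alpha = t'_{u}/t\), \(\beta = t'_{v}/t\) via (110)–(112), and compare with finite-difference second derivatives of (106):

```python
import math

p = 3.0
q = p / (p - 1.0)

def t_of(u, v):
    """Positive root of (107) by bisection in log t."""
    g = lambda t: (p**(1/p)/q) * t**(-1/p) * u + (p**(1/q)/p) * t**(-1/q) * v
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo*hi)
        lo, hi = (mid, hi) if g(mid) > 1.0 else (lo, mid)
    return math.sqrt(lo*hi)

def B(u, v):
    """Formula (106)."""
    t = t_of(u, v)
    return (p**(1/p)/p)*t**(1/q)*u + (p**(1/q)/q)*t**(1/p)*v - u*v

u, v, h = 1.2, 0.8, 1e-4
t = t_of(u, v)
S = p**(1/p)*t**(1/q)*u + p**(1/q)*t**(1/p)*v   # (110)
alpha = p * p**(1/p) * t**(1/q) / S             # (111)
beta  = q * p**(1/q) * t**(1/p) / S             # (112)

Buu = (B(u+h, v) - 2*B(u, v) + B(u-h, v)) / h**2
Bvv = (B(u, v+h) - 2*B(u, v) + B(u, v-h)) / h**2
Buv = (B(u+h, v+h) - B(u+h, v-h) - B(u-h, v+h) + B(u-h, v-h)) / (4*h**2)
```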

Let us introduce the notations:

$$\displaystyle{\alpha = \frac{t'_{u}} {t},\,\beta = \frac{t'_{v}} {t},\,m = \frac{1} {pq}S,\,\tau = \frac{\alpha } {\beta }.}$$

Then we saw in the previous sections that the Hessian quadratic form of B

$$\displaystyle{Q(dx_{1},dx_{2}) = B_{uu}dx_{1}^{2} + 2B_{uv}dx_{1}dx_{2} + B_{vv}dx_{2}^{2}}$$

will have the form

$$\displaystyle{ Q = \frac{\alpha } {\beta }dx_{1}^{2} + \frac{\beta } {\alpha }dx_{2}^{2} + \frac{\alpha } {\beta }(m\alpha \beta - 1)(dx_{1} + \frac{\beta } {\alpha }dx_{2})^{2}. }$$
(117)
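Expanding the squares in (117) and using (114)–(116) in the form \(B_{uu} = m\alpha ^{2}\), \(B_{vv} = m\beta ^{2}\), \(B_{uv}+1 = m\alpha \beta\), one can confirm the decomposition on sample data (a check of ours):

```python
def Q_decomposed(alpha, beta, m, dx1, dx2):
    """Right-hand side of (117)."""
    tau = alpha / beta
    return (tau*dx1**2 + dx2**2/tau
            + tau*(m*alpha*beta - 1.0)*(dx1 + dx2/tau)**2)

def Q_hessian(alpha, beta, m, dx1, dx2):
    """B_uu dx1^2 + 2 B_uv dx1 dx2 + B_vv dx2^2 with the entries (114)-(116)."""
    return (m*alpha**2*dx1**2
            + 2.0*(m*alpha*beta - 1.0)*dx1*dx2
            + m*beta**2*dx2**2)
```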

It is useful if the reader thinks that in what follows y 11, y 12, y 21, y 22 are, correspondingly, \(\Phi,\Psi,U,V\).

Also in what follows dy 11, dy 12, dy 21, dy 22 can be viewed as ϕ 1, ψ 1, u 1, v 1 and ϕ 2, ψ 2, u 2, v 2.

Our goal now is to “tensorize” the form Q. This operation means in our particular case to consider the new function, now of 4 real variables (or 2 complex variables if one prefers), given by

$$\displaystyle{\mathcal{B}:= \mathcal{B}(y_{11},y_{12},y_{21},y_{22})\,:=\,B(x_{1},x_{2}),\,\text{where}\,\,x_{1}:= \sqrt{y_{11 }^{2 } + y_{12 }^{2}},x_{2}:= \sqrt{y_{21 }^{2 } + y_{22 }^{2}}}$$

and to write its Hessian quadratic form. In the previous section we saw the formula for doing that:

$$\displaystyle\begin{array}{rcl} & \mathbb{Q} = \frac{\alpha }{\beta }\bigg(\frac{y_{11}dy_{11}+y_{12}dy_{12}} {x_{1}} \bigg)^{2} + \frac{\beta }{\alpha }\bigg(\frac{y_{21}dy_{21}+y_{22}dy_{22}} {x_{2}} \bigg)^{2}+& {}\\ & \frac{\alpha }{\beta }(m\alpha \beta - 1)\bigg(\frac{y_{11}dy_{11}+y_{12}dy_{12}} {x_{1}} + \frac{\beta }{\alpha } \frac{y_{21}dy_{21}+y_{22}dy_{22}} {x_{2}} \bigg)^{2}+& {}\\ & \frac{B_{u}} {u} \bigg(\frac{y_{12}dy_{11}-y_{11}dy_{12}} {x_{1}} \bigg)^{2} + \frac{B_{v}} {v} \bigg(\frac{y_{22}dy_{21}-y_{21}dy_{22}} {x_{2}} \bigg)^{2}. & {}\\ \end{array}$$

To show that this quadratic form has an interesting self-improving property we are going to make some calculations. First of all notice that

$$\displaystyle{ \tau = \frac{\alpha } {\beta } = \frac{p \cdot p^{\frac{1} {p} } \cdot t^{\frac{1} {q} }} {q \cdot p^{\frac{1} {q} } \cdot t^{\frac{1} {p} }} }$$
(118)

Now we start with combining (108) with (111)

$$\displaystyle{ B_{u} = \frac{p^{\frac{1} {p} }} {p} t^{\frac{1} {q} } - v + \frac{1} {pq}S\frac{p \cdot p^{\frac{1} {p} } \cdot t^{\frac{1} {q} }} {S} = p^{\frac{1} {p} }t^{\frac{1} {q} } - v. }$$
(119)

Let us see that

$$\displaystyle{ \frac{p} {q} \frac{p^{\frac{1} {p} }} {p^{\frac{1} {q} }} \frac{t^{\frac{1} {q} }} {t^{\frac{1} {p} }} = \frac{p^{\frac{1} {p} }t^{\frac{1} {q} }} {u} -\frac{v} {u}. }$$
(120)

This is the same as

$$\displaystyle{p \cdot p^{\frac{1} {p} } \cdot t^{\frac{1} {q} }u = qpt - q \cdot p^{\frac{1} {q} } \cdot t^{\frac{1} {p} }v.}$$

But the last claim is correct: it is just the implicit equation (107) for t. So (120) is correct. So, combining (118) and (119) we obtain

$$\displaystyle{ \frac{B_{u}} {u} = \frac{\alpha } {\beta }. }$$
(121)

We would expect that \(\frac{B_{v}} {v} = \frac{\beta } {\alpha } = \frac{1} {\tau }\) by symmetry, but actually \(\frac{B_{v}} {v}> \frac{\beta } {\alpha }\) for p > 2 and this allows us to have an improved inequality for \(\mathbb{Q}\). Let us see how.

Using (107) we get

$$\displaystyle\begin{array}{rcl} & \frac{B_{v}} {v} -\frac{\beta } {\alpha } = \frac{p^{\frac{1} {q} }t^{\frac{1} {p} }} {v} -\frac{u} {v} -\frac{q\cdot p^{\frac{1} {q} }\cdot t^{\frac{1} {p} }} {p\cdot p^{\frac{1} {p} }\cdot t^{\frac{1} {q} }} =& {}\\ & \frac{p^{2}t-p\cdot p^{\frac{1} {p} }\cdot t^{\frac{1} {q} }u-q\cdot p^{\frac{1} {q} }\cdot t^{\frac{1} {p} }v} {p\cdot p^{\frac{1} {p} }\cdot t^{\frac{1} {q} }v} = & {}\\ & \frac{(\,p^{2}-pq)t} {p\cdot p^{\frac{1} {p} }\cdot t^{\frac{1} {q} }v} = \frac{p-q} {p^{\frac{1} {p} }} \frac{t^{\frac{1} {p} }} {v} = & {}\\ & \frac{p-q} {p^{\frac{1} {p} }} \frac{\frac{p^{\frac{1} {p} }} {q} t^{\frac{1} {q} }u+\frac{p^{\frac{1} {q} }} {p} t^{\frac{1} {p} }v} {t^{\frac{1} {q} }v} = & {}\\ & \frac{(1-q/p)} {p^{\frac{1} {p}-\frac{1} {q} }} t^{\frac{1} {p}-\frac{1} {q} } +\bigg (\frac{p} {q} - 1\bigg)\frac{u} {v}. & {}\\ \end{array}$$

In particular, using (118)

$$\displaystyle\begin{array}{ccc} & \frac{B_{v}} {v} -\frac{\beta } {\alpha } + \frac{2} {\tau } \geq \frac{(1-q/p)} {p^{\frac{1} {p}-\frac{1} {q} }} t^{\frac{1} {p}-\frac{1} {q} } + 2\frac{q\cdot p^{\frac{1} {q} }\cdot t^{\frac{1} {p} }} {p\cdot p^{\frac{1} {p} }\cdot t^{\frac{1} {q} }} =& {}\\ & \frac{(1-q/p)} {p^{\frac{1} {p}-\frac{1} {q} }} t^{\frac{1} {p}-\frac{1} {q} } + \frac{2q/p} {p^{\frac{1} {p}-\frac{1} {q} }}t^{\frac{1} {p}-\frac{1} {q} } = & {}\\ &\frac{q} {p^{\frac{1} {p}-\frac{1} {q} }}t^{\frac{1} {p}-\frac{1} {q} } = p \cdot \frac{1} {\tau }. & {}\\ \end{array}$$

This is exactly what we need:

$$\displaystyle{ \frac{B_{v}} {v} -\frac{\beta } {\alpha } + \frac{2} {\tau } = p \cdot \frac{1} {\tau } + (\,p/q - 1)\frac{u} {v} \geq p \cdot \frac{1} {\tau }. }$$
(122)
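Identity (122) admits a direct numerical spot-check (ours), using \(B_{v} = p^{\frac{1}{q}}t^{\frac{1}{p}} - u\) (the analogue of (119)) and τ from (118):

```python
import math

p = 3.0
q = p / (p - 1.0)

def t_of(u, v):
    """Positive root of (107) by bisection in log t."""
    g = lambda t: (p**(1/p)/q) * t**(-1/p) * u + (p**(1/q)/p) * t**(-1/q) * v
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo*hi)
        lo, hi = (mid, hi) if g(mid) > 1.0 else (lo, mid)
    return math.sqrt(lo*hi)

def sides_of_122(u, v):
    t = t_of(u, v)
    tau = (p * p**(1/p) * t**(1/q)) / (q * p**(1/q) * t**(1/p))   # (118)
    Bv = p**(1/q) * t**(1/p) - u                                  # analogue of (119)
    lhs = Bv/v - 1.0/tau + 2.0/tau        # note beta/alpha = 1/tau
    rhs = p/tau + (p/q - 1.0)*(u/v)
    return lhs, rhs
```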

Now let us take a look at \(\mathbb{Q}\) and let us plug (121) and (122) into it. Then

$$\displaystyle{ \mathbb{Q} \geq \tau (dy_{11}^{2} + dy_{ 12}^{2}) + \frac{1} {\tau } (dy_{21}^{2} + dy_{ 22}^{2}) + (\frac{B_{v}} {v} -\frac{\beta } {\alpha })\bigg(\frac{y_{22}dy_{21} - y_{21}dy_{22}} {x_{2}} \bigg)^{2}. }$$
(123)

Now imagine that we apply this estimate to two different collections of vectors (dy 11, dy 12, dy 21, dy 22), (dy′ 11, dy′ 12, dy′ 21, dy′ 22). Moreover, suppose that we have the orthonormality condition

$$\displaystyle{ dy_{21} \cdot dy_{22} + dy'_{21} \cdot dy'_{22} = 0,dy_{21}^{2} + (dy'_{ 21})^{2} = dy_{ 22}^{2} + (dy'_{ 22})^{2}. }$$
(124)

Then we get from (123), (124)

$$\displaystyle\begin{array}{rcl} & \mathbb{Q}(dy) + \mathbb{Q}(dy') \geq \tau (dy_{11}^{2} + dy_{12}^{2} + (dy'_{11})^{2} + (dy'_{12})^{2}) + 1/\tau (dy_{21}^{2} + dy_{22}^{2} + (dy'_{21})^{2} + (dy'_{22})^{2})+& {}\\ & (\frac{B_{v}} {v} -\frac{\beta } {\alpha })\frac{y_{22}^{2}+y_{ 21}^{2}} {x_{2}^{2}} \frac{(dy_{21}^{2}+dy_{ 22}^{2}+(dy'_{ 21})^{2}+(dy'_{ 22})^{2})} {2}. & {}\\ \end{array}$$

We denote ξ 1 2: = dy 11 2 + dy 12 2 + (dy11)2 + (dy12)2, ξ 2 2: = dy 21 2 + dy 22 2 + (dy21)2 + (dy22)2. Using that \(\frac{y_{22}^{2}+y_{ 21}^{2}} {x_{2}^{2}} = 1\) and (122) we rewrite the RHS and get

$$\displaystyle\begin{array}{rcl} & \mathbb{Q}(dy) + \mathbb{Q}(dy') \geq \tau \cdot \xi _{1}^{2} + \frac{1} {2}(\frac{B_{v}} {v} -\frac{\beta } {\alpha } + \frac{2} {\tau } )\xi _{2}^{2} \geq \tau \cdot \xi _{ 1}^{2} + 1/\tau \cdot \frac{p} {2}\xi _{2}^{2} \geq & \\ & 2\sqrt{\frac{p} {2}}(dy_{11}^{2} + dy_{ 12}^{2} + (dy'_{ 11})^{2} + (dy'_{ 12})^{2})^{\frac{1} {2} }(dy_{21}^{2} + dy_{22}^{2} + (dy'_{21})^{2} + (dy'_{22})^{2})^{\frac{1} {2} }.&{}\end{array}$$
(125)

So we won \(\sqrt{2/p} = \sqrt{\frac{2(q-1)} {q}}\) in comparison with the usual Burkholder estimate, which would be \(\leq \frac{1} {q-1}\). So the estimate for the orthogonal martingale will be \(\leq \sqrt{\frac{2(q-1)} {q}} \cdot \frac{1} {q-1} = \sqrt{ \frac{2} {q(q-1)}}\).
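The closing arithmetic can be double-checked (ours): the gain \(\sqrt{2/p} = \sqrt{\frac{2(q-1)} {q}}\) applied to the Burkholder bound \(\frac{1} {q-1}\) indeed equals \(\sqrt{ \frac{2} {q(q-1)}}\):

```python
import math

def gain_times_burkholder(q):
    """sqrt(2(q-1)/q) * 1/(q-1): the 'win' applied to the bound 1/(q-1)."""
    return math.sqrt(2.0*(q - 1.0)/q) / (q - 1.0)

def orthogonal_constant(q):
    """sqrt(2/(q(q-1))), the constant claimed for the orthogonal martingale."""
    return math.sqrt(2.0/(q*(q - 1.0)))
```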

And we get Theorem 7.