1 Introduction and Statement of Main Results

1.1 Background and Main Results

Multiparameter harmonic analysis was introduced in the ’70s and studied extensively in the ’80s, led by Chang, Fefferman, Gundy, Journé, Pipher, Stein and others (see for example [12,13,14,15, 29,30,31,32,33,34, 39, 55, 56, 67, 72]). The theory of multiparameter harmonic analysis is largely influenced by the corresponding theory of one-parameter (classical) harmonic analysis, but is strongly motivated by two different geometric phenomena. First, one naturally encounters families of rectangles and operators that are invariant under different scalings than the standard one (e. g., the operator is invariant under a scaling in each variable separately and not just a uniform scaling of all the variables). Second, the boundary behavior of analytic functions in several complex variables necessitated an understanding of approach regions that behaved differently in each variable separately. Both these naturally lead to the theory of harmonic analysis allowing for a decomposition of functions admitting different, independent behavior in each variable separately.

As in classical harmonic analysis a key ingredient in the theory is the development of the Hardy and BMO spaces, their duality and the connections to atomic decompositions. In the multiparameter setting the geometry alluded to above leads to a more complicated description of product BMO. As demonstrated by Carleson the natural BMO condition on rectangles is not sufficient to characterize the dual of the Hardy space. This necessitates a BMO theory based on arbitrary open sets and leads to numerous geometric challenges in the theory. An important result in the area is Journé’s covering lemma which provides a tool by which the general open sets can be replaced by certain families of rectangles with controlled geometry. After this important ground work was established, the development of multiparameter harmonic analysis followed the lines of obtaining T1 theorems, characterizations of Hardy spaces via non-tangential and radial maximal functions, and Hilbert (Riesz) transforms. More recent developments in the area include the dyadic structures, characterization of product BMO via commutators (see for example [18, 35, 36, 57, 60,61,62, 65, 69, 71, 76]), and the developments of T1 and Tb theorems (see for example [53, 70]). As is well known, the product Hardy space \(H^1 \) has a variety of equivalent norms, in terms of square functions, maximal functions and Hilbert transforms (Riesz transforms in higher dimension), see for example [59, p. 19]. See also the product Hardy spaces and boundedness of product singular integrals in different versions studied in [9, 11, 49, 63, 64]. 
Recently, Han, the second author, and Lu [43] developed the product Hardy spaces \(H^p(X_1\times X_2)\), for \(p\le 1\) and close to 1, on product spaces of homogeneous type \(X_1\times X_2\) (in the sense of Coifman and Weiss [17]) via the discrete Littlewood–Paley–Stein square function, and proved the duality of \(H^p\) with the Carleson measure type spaces \(CMO^p\); see also the related results in [42] and [46]. Later, the boundedness of singular integrals, the product T1 theorem, and atomic decomposition of \(H^p\) were also studied in [41, 44, 45, 66].

The theory of the classical Hardy space is intimately connected to the Laplacian; changing the differential operator introduces new challenges and directions to explore. In the past 10 years, a theory of Hardy spaces associated to operators was introduced and developed by the first author, Hofmann, McIntosh, Yan and many others (we refer to [26, 27, 50, 51, 54, 74] and the references therein). In [20] the authors first introduced the product Hardy space on the Euclidean setting associated with operators via area functions, and the product BMO space via Carleson measures and proved the duality. We also refer to [25, 73] for the product Hardy spaces on the Euclidean setting associated with operators for the atomic decomposition. Recently, Chen, Ward, Yan and the first and second authors [16] developed the product Hardy spaces \(H^1_{L_1,\,L_2}(X_1\times X_2)\) on \(X_1\times X_2\) (the product spaces of homogeneous type) associated with operators via Littlewood–Paley area functions and atomic decompositions, and studied the boundedness of product singular integrals with non-smooth kernels, the Calderón–Zygmund decomposition and interpolations of \(H^p\), as well as the boundedness of Marcinkiewicz type multipliers. Here \(L_1\) and \(L_2\) are two non-negative self-adjoint operators acting on \(L^2(X_1)\) and \(L^2(X_2)\), respectively, and satisfying Davies–Gaffney estimates, which is known as a rather weak assumption. However, the weak conditions on \(L_1\) and \(L_2\) seem not strong enough for obtaining the other characterizations of product space \(H^1_{L_1,\,L_2}(X_1\times X_2)\), for example, via the “Riesz transforms” or maximal functions, and the decomposition of product BMO space in this setting is not known either.

In this paper, we try to address the characterisation of product Hardy spaces associated with operators via the double Riesz transforms as well as via maximal functions by focusing on a well-known candidate: the Bessel operator.

In 1965, Muckenhoupt and Stein in [68] introduced the harmonic function theory associated with Bessel operator \({\triangle _\lambda }\), defined by,

$$\begin{aligned} {\triangle _\lambda }:=-\frac{d^2}{dx^2}-\frac{2\lambda }{x}\frac{d}{dx},\quad \lambda >0. \end{aligned}$$

The related elliptic partial differential equation is the following “singular Laplace equation”

$$\begin{aligned} \triangle _{t,\,x} (u) :=-\partial _{t}^2u - \partial _{x}^2u-\frac{2\lambda }{x}\partial _{x}u=0 \end{aligned}$$
(1.1)

studied by Weinstein [78], and Huber [52] in higher dimension, where they considered the generalised axially symmetric potentials, and obtained the properties of the solutions of this equation, such as the extension, the uniqueness theorem, and the boundary value problem for certain domains.

If u is a solution of (1.1) then u is said to be \(\lambda \)-harmonic. The function u and the conjugate of u (denoted by v) satisfy the following Cauchy–Riemann type equations

$$\begin{aligned} \partial _{x}u=-\partial _{t}v\ \ {{{\mathrm{and}}}}\ \ \partial _{t}u =\partial _{x}v + {2\lambda \over x} v\ \ {{{\mathrm{in}}}}\ \ {\mathbb {R}}_+\times {\mathbb {R}}_+. \end{aligned}$$
(1.2)
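As a quick consistency check (a sketch that assumes u and v are smooth enough for mixed partial derivatives to commute), differentiating the two equations in (1.2) and eliminating v recovers (1.1), while eliminating u gives the companion equation satisfied by the conjugate function v:

$$\begin{aligned} \partial _{t}^2u&=\partial _{t}\Big (\partial _{x}v+\frac{2\lambda }{x}v\Big )=\partial _{x}(\partial _{t}v)+\frac{2\lambda }{x}\partial _{t}v=-\partial _{x}^2u-\frac{2\lambda }{x}\partial _{x}u,\\ \partial _{t}^2v&=-\partial _{t}\partial _{x}u=-\partial _{x}\Big (\partial _{x}v+\frac{2\lambda }{x}v\Big )=-\partial _{x}^2v-\frac{2\lambda }{x}\partial _{x}v+\frac{2\lambda }{x^2}v. \end{aligned}$$

The first line is exactly (1.1). The second line shows that v is in general not \(\lambda \)-harmonic: it solves the same equation with the additional zeroth-order term \(2\lambda x^{-2}v\).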

In [68] they developed a theory in the setting of \({\triangle _\lambda }\) which parallels the classical one associated to the standard Laplacian, where results on \({L^p({{\mathbb {R}}}_+,\, dm_\lambda )}\)-boundedness of conjugate functions and fractional integrals associated with \({\triangle _\lambda }\) were obtained for \(p\in [1, \infty )\) and \({dm_\lambda }(x):= x^{2\lambda }\,dx\).
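The weight \(x^{2\lambda }\) is the natural one for \({\triangle _\lambda }\). A short computation, sketched here for \(f,\,g\in C_c^\infty ((0,\infty ))\) (an assumption made only to avoid boundary terms), shows that \({\triangle _\lambda }\) is in divergence form with respect to \({dm_\lambda }\):

$$\begin{aligned} {\triangle _\lambda }f(x)&=-x^{-2\lambda }\frac{d}{dx}\Big (x^{2\lambda }f'(x)\Big ),\\ \int _0^\infty {\triangle _\lambda }f(x)\,g(x)\,{dm_\lambda }(x)&=-\int _0^\infty \big (x^{2\lambda }f'(x)\big )'g(x)\,dx=\int _0^\infty f'(x)\,g'(x)\,{dm_\lambda }(x). \end{aligned}$$

The right-hand side is symmetric in f and g and is non-negative when \(g=f\), which is consistent with \({\triangle _\lambda }\) being treated as a non-negative self-adjoint operator on \(L^2({\mathbb {R}}_+,\,{dm_\lambda })\).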

We also point out that Haimo [40] studied the Hankel convolution transforms \(\varphi \sharp _\lambda f\) associated with the Hankel transform in the Bessel setting systematically, which provides a parallel theory to the classical convolution and Fourier transforms. It is well-known that the Poisson integral of f studied in [68] is the Hankel convolution of Poisson kernel with f, see [5].

Since then, many problems based on the Bessel context were studied, such as the boundedness of Bessel Riesz transform, Littlewood–Paley functions, Hardy and BMO spaces associated with Bessel operators, \(A_p\) weights associated with Bessel operators (see, for example, [1, 3, 4, 6,7,8, 22,23,24, 58, 77, 80] and the references therein).

The aim of this paper is to focus on this specific Bessel setting, to establish the equivalent characterizations of product Hardy spaces in terms of the Bessel Riesz transforms, non-tangential and radial maximal functions defined via Poisson and heat semigroups and to obtain the decomposition of product BMO spaces associated with Bessel operator \({\triangle _\lambda }\). To obtain this, we build up a variant of the technical lemma of Merryfield [67], which connects the product non-tangential maximal function and the area function, and we make good use of the generalised Cauchy–Riemann type equations (1.2) and establish the grand maximal function which connects the non-tangential and radial maximal functions. Then, as a direct consequence, we obtain the decomposition of \(\mathrm{BMO}_{\Delta _\lambda }\) via the Bessel Riesz transforms. We note that these results in the second part are first extensions for product Hardy and BMO spaces beyond the Chang–Fefferman setting on Euclidean spaces.

1.2 Statement of Main Results

Throughout the paper, for every interval \(I\subset {\mathbb {R}}_+\), we denote it by \(I:=I(x,t):= (x-t,x+t)\cap {\mathbb {R}}_+\). The measure of I is defined as \(m_\lambda (I(x,t)):=\int _{I(x,\,t)} y^{2\lambda } dy\). We observe that for any interval \(I:=I(x,r)\subset {\mathbb {R}}_+\), \(m_\lambda (I)\sim x^{2\lambda }r+r^{2\lambda +1}\); moreover, from [22] we have that for any \(I\subset {{{\mathbb {R}}}_+}\),

$$\begin{aligned} \min \big (2, 2^{2\lambda }\big )m_\lambda (I)\le m_\lambda (2I)\le 2^{2\lambda +1}m_\lambda (I), \end{aligned}$$
(1.3)

where for any \(k\in (0, \infty )\) and \(I:= I(x, r)\) for some x, \(r\in (0, \infty )\), \(kI:=I(x, kr)\). Thus, \(({\mathbb {R}}_+,|\cdot |,{dm_\lambda })\) is a space of homogeneous type.
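The equivalence \(m_\lambda (I(x,r))\sim x^{2\lambda }r+r^{2\lambda +1}\) can be checked by hand. The following sketch splits into the cases \(x<r\) and \(x\ge r\); the implicit constants depend only on \(\lambda \) and are not claimed to be optimal:

$$\begin{aligned} x<r:&\quad m_\lambda (I(x,r))=\int _0^{x+r}y^{2\lambda }\,dy=\frac{(x+r)^{2\lambda +1}}{2\lambda +1}\in \Big [\frac{r^{2\lambda +1}}{2\lambda +1},\,\frac{(2r)^{2\lambda +1}}{2\lambda +1}\Big ],\qquad x^{2\lambda }r\le r^{2\lambda +1};\\ x\ge r:&\quad x^{2\lambda }r\le \int _x^{x+r}y^{2\lambda }\,dy\le m_\lambda (I(x,r))\le 2r(x+r)^{2\lambda }\le 2^{2\lambda +1}x^{2\lambda }r,\qquad r^{2\lambda +1}\le x^{2\lambda }r. \end{aligned}$$

In the first case the measure is comparable to \(r^{2\lambda +1}\), in the second to \(x^{2\lambda }r\), and in each case the other term is dominated by the leading one. Replacing r by 2r multiplies \(x^{2\lambda }r+r^{2\lambda +1}\) by a factor between 2 and \(2^{2\lambda +1}\), which is in line with (though weaker than) the explicit constants in (1.3).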

In the product setting \({\mathbb {R}}_+\times {\mathbb {R}}_+\), we define

$$\begin{aligned} d\mu _\lambda (x_1, x_2):=dm_\lambda (x_1)\times dm_\lambda (x_2)\,\, {\mathrm{and}}\,\, {\mathbb {R}}_\lambda := ({\mathbb {R}}_+\times {\mathbb {R}}_+,d\mu _\lambda (x_1, x_2)). \end{aligned}$$

We work with the domain \(( {\mathbb {R}}_+\times {\mathbb {R}}_+) \times ( {\mathbb {R}}_+\times {\mathbb {R}}_+)\) and its distinguished boundary \({ {\mathbb {R}}_+\times {\mathbb {R}}_+}\). For \(x := (x_1,x_2)\in { {\mathbb {R}}_+\times {\mathbb {R}}_+}\), denote by \(\Gamma (x)\) the product cone \(\Gamma (x) := \Gamma _1(x_1)\times \Gamma _2(x_2)\), where for \(i= 1, 2\),

$$\begin{aligned} \Gamma _i(x_i) := \{(y_i,t_i)\in {\mathbb {R}}_+\times {\mathbb {R}}_+: |x_i-y_i| < t_i\}. \end{aligned}$$

We now give the definition of \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\), \(p\in ((2\lambda +1)/(2\lambda +2), 1]\), following the approach in [16], using the Littlewood–Paley area functions via the operators \(\{t\sqrt{\triangle _\lambda }e^{-t\sqrt{\triangle _\lambda }}\}_{t>0}\). To be precise, given a function f in \(L^2({\mathbb {R}}_\lambda )\), the Littlewood–Paley area function Sf(x), \(x:=(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\), associated with the operator \(\Delta _\lambda \), is defined as

$$\begin{aligned} Sf(x)&:= \bigg (\iint _{\Gamma (x) }\Big | \sqrt{\triangle _\lambda }e^{-t_1\sqrt{\triangle _\lambda }}\, \sqrt{\triangle _\lambda }e^{-t_2\sqrt{\triangle _\lambda }}f(y_1,y_2)\Big |^2\nonumber \\&\qquad \times {t_1 t_2 d\mu _\lambda (y_1,y_2)dt_1 dt_2 \over m_\lambda (I(x_1,t_1)) m_\lambda (I(x_2,t_2))}\bigg )^{1\over 2}, \end{aligned}$$
(1.4)

where \(\sqrt{\triangle _\lambda }e^{-t_1\sqrt{\triangle _\lambda }}\) acts on the first variable and \(\sqrt{\triangle _\lambda }e^{-t_2\sqrt{\triangle _\lambda }}\) acts on the second variable of f, respectively.
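To see how (1.4) matches the scale-invariant operators \(t\sqrt{\triangle _\lambda }e^{-t\sqrt{\triangle _\lambda }}\) mentioned above, one can move the factor \(t_1t_2\) inside the absolute value. This is only a rewriting of the integrand, not an additional assumption:

$$\begin{aligned} \Big |\sqrt{\triangle _\lambda }e^{-t_1\sqrt{\triangle _\lambda }}\,\sqrt{\triangle _\lambda }e^{-t_2\sqrt{\triangle _\lambda }}f(y_1,y_2)\Big |^2\,t_1t_2=\Big |t_1\sqrt{\triangle _\lambda }e^{-t_1\sqrt{\triangle _\lambda }}\,t_2\sqrt{\triangle _\lambda }e^{-t_2\sqrt{\triangle _\lambda }}f(y_1,y_2)\Big |^2\,\frac{1}{t_1t_2}. \end{aligned}$$

Thus Sf is the usual product area function built from \(t_i\sqrt{\triangle _\lambda }e^{-t_i\sqrt{\triangle _\lambda }}\), \(i=1,2\), integrated against \(\frac{d\mu _\lambda (y_1,y_2)\,dt_1dt_2}{t_1\,m_\lambda (I(x_1,t_1))\,t_2\,m_\lambda (I(x_2,t_2))}\) over \(\Gamma (x)\).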

We now define the product Hardy space \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) by using (1.4) as follows.

Definition 1.1

For \(p\in ((2\lambda +1)/(2\lambda +2), 1]\), the Hardy space \(H^p_{\Delta _\lambda }( {\mathbb {R}}_\lambda )\) is defined as the completion of

$$\begin{aligned} \{f\in L^2({\mathbb {R}}_\lambda ) : \Vert Sf\Vert _{L^p({\mathbb {R}}_\lambda )} < \infty \} \end{aligned}$$

with respect to the norm (quasi-norm) \( \Vert f\Vert _{H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda ) } := \Vert Sf \Vert _{L^p( {\mathbb {R}}_\lambda )}, \) where Sf is defined by (1.4).

We point out that the setting \({\mathbb {R}}_\lambda \) falls into the scope of a product space of homogeneous type. Moreover, for each t, the kernel of \(t\sqrt{\triangle _\lambda }e^{-t\sqrt{\triangle _\lambda }}\) turns out to satisfy the same standard size, smoothness and cancellation conditions (see \(\mathrm{(K_i)}\)-\({\mathrm{(K_{iii}})}\) in Sect. 3.1 below) as the difference operator \(D_k\) in [43], where the authors defined a version of the product Hardy space \(H^p({\mathbb {R}}_\lambda )\) via the discrete Littlewood–Paley–Stein square function \(S_d(f)\) (see Definitions 2.4 and 2.5), and obtained the atomic decomposition and dual space of \(H^p({\mathbb {R}}_\lambda )\). In fact, \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda ) \) coincides with \(H^p({\mathbb {R}}_\lambda )\); see Theorem 3.5 below.

We now provide several definitions of \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\), \(p\in ((2\lambda +1)/(2\lambda +2), 1]\). This requires some additional notation, but the careful reader will notice that the spaces are distinguished notationally by a subscript to remind how they are defined.

We begin with the definition of another version of the Littlewood–Paley area function. Let

$$\begin{aligned} \nabla _{t_1,\,y_1}:=(\partial _{t_1}, \partial _{y_1}),\,\, \nabla _{t_2,\,y_2}:=(\partial _{t_2}, \partial _{y_2}). \end{aligned}$$

Then the Littlewood–Paley area function \(S_uf(x)\) for \(f \in L^2({\mathbb {R}}_\lambda )\), \(x:=(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\) is defined as

$$\begin{aligned} S_uf(x)&:= \bigg (\iint _{\Gamma (x) }\big | \nabla _{t_1,\,y_1} e^{-t_1\sqrt{{\triangle _\lambda }}}\nabla _{t_2,\,y_2}e^{-t_2\sqrt{{\triangle _\lambda }}}(f)(y_1,y_2) \big |^2 \nonumber \\&\qquad \times \frac{t_1t_2\ d\mu _\lambda (y_1,y_2) dt_1dt_2}{m_\lambda (I(x_1,t_1)) m_\lambda (I(x_2,t_2))}\bigg )^{1\over 2}. \end{aligned}$$
(1.5)

Then naturally we have the following definition of the product Hardy space via \(S_uf\).

Definition 1.2

For \(p\in ((2\lambda +1)/(2\lambda +2), 1]\), the Hardy space \({H^p_{S_u}({\mathbb {R}}_\lambda )}\) is defined as the completion of

$$\begin{aligned} \{f\in L^2({\mathbb {R}}_\lambda ) : \Vert S_uf\Vert _{L^p({\mathbb {R}}_\lambda )} < \infty \} \end{aligned}$$

with respect to the norm (quasi-norm) \( \Vert f\Vert _{H^{p}_{S_u}( {\mathbb {R}}_\lambda ) } := \Vert S_uf \Vert _{L^p({\mathbb {R}}_\lambda )}. \)

Next we define the product non-tangential and radial maximal functions via the heat semigroup and Poisson semigroup associated to \(\Delta _\lambda \). For all \(\alpha \in (0, \infty )\), \(f\in L^2({\mathbb {R}}_\lambda )\) and \(x_1,x_2\in {\mathbb {R}}_+\), let

$$\begin{aligned} {{\mathcal {N}}}^\alpha _{h}f(x_1, x_2)&:=\!\!\sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|<\alpha t_1}{|y_2-x_2|<\alpha t_2}}\!\left|e^{-t_1^2{\triangle _\lambda }}e^{-t_2^2{\triangle _\lambda }} f(y_1, y_2)\right|,\\ {{\mathcal {N}}}^\alpha _{P}f(x_1, x_2)&:=\!\!\sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|<\alpha t_1}{|y_2-x_2|<\alpha t_2}}\!\left|e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}} f(y_1, y_2)\right| \end{aligned}$$

be the product non-tangential maximal functions with aperture \(\alpha \) via the heat semigroup and Poisson semigroup associated to \(\Delta _\lambda \), respectively. We fix the aperture \(\alpha :=1\) and denote \({{\mathcal {N}}}^1_{h}f\) by \({{\mathcal {N}}}_{h}f\) and \({{\mathcal {N}}}^1_{P}f\) by \({{\mathcal {N}}}_{P}f\). Moreover let

$$\begin{aligned} {{\mathcal {R}}}_{h}f(x_1, x_2)&:=\sup _{t_1>0,\,t_2>0}\left|e^{-t^2_1{\triangle _\lambda }}e^{-t^2_2{\triangle _\lambda }} f(x_1, x_2)\right|,\\ {{\mathcal {R}}}_{P}f(x_1, x_2)&:=\sup _{t_1>0,\,t_2>0}\left|e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}} f(x_1,x_2)\right| \end{aligned}$$

be the product radial maximal functions via the heat semigroup and Poisson semigroup associated to \(\Delta _\lambda \), respectively.

Definition 1.3

The Hardy space \({H^p_{{{\mathcal {M}}}}({\mathbb {R}}_\lambda )}\), \(p\in ((2\lambda +1)/(2\lambda +2), 1]\), associated to the maximal function \({ {{\mathcal {M}}}}f\) is defined as the completion of the set

$$\begin{aligned} \{f\in L^2({\mathbb {R}}_\lambda ) : \Vert { {{\mathcal {M}}}}f\Vert _{L^p({\mathbb {R}}_\lambda )} < \infty \} \end{aligned}$$

with the norm (quasi-norm) \( \Vert f\Vert _{{H^p_{{{\mathcal {M}}}}({\mathbb {R}}_\lambda )}} := \Vert { {{\mathcal {M}}}}f \Vert _{L^p({\mathbb {R}}_\lambda )}. \) Here \({ {{\mathcal {M}}}}f\) is one of the following maximal functions: \({{\mathcal {N}}}_{h}f\), \({{\mathcal {N}}}_{P}f\), \({{\mathcal {R}}}_{h}f\) and \({{\mathcal {R}}}_{P}f\).

Our first main result of this paper is as follows.

Theorem 1.4

Let \(p\in ((2\lambda +1)/(2\lambda +2), 1]\). The product Hardy spaces \({H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\), \({H^p_{S_u}({\mathbb {R}}_\lambda )}\), \({H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )},\) \({H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\), \({H^p_{{{\mathcal {R}}}_{P}}({\mathbb {R}}_\lambda )}\) and \({H^p_{{{\mathcal {N}}}_{P}}({\mathbb {R}}_\lambda )}\) coincide and have equivalent norms (or quasi-norms).

Next we consider the definition of product Hardy space via the Bessel Riesz transforms \({R_{\Delta _\lambda ,\,1}}(f)\) and \({R_{\Delta _\lambda ,\,2}}(f)\) on the first and second variable, respectively. For the definition of Bessel Riesz transforms, we refer to (2.4) in Sect. 2.2.

Definition 1.5

The product Hardy space \({H^1_{Riesz}({\mathbb {R}}_\lambda )}\) is defined as the completion of

$$\begin{aligned}&\left\lbrace f\in {L^1({\mathbb {R}}_\lambda )}\cap L^2({\mathbb {R}}_\lambda ): \,\, {R_{\Delta _\lambda ,\,1}}f,\ {R_{\Delta _\lambda ,\,2}}f,\ {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f\in {L^1({\mathbb {R}}_\lambda )}\right\rbrace \end{aligned}$$

endowed with the norm

$$\begin{aligned} \Vert f\Vert _{{H^1_{Riesz}({\mathbb {R}}_\lambda )}}&:=\Vert f\Vert _{L^1({\mathbb {R}}_\lambda )}+\Vert {R_{\Delta _\lambda ,\,1}}f\Vert _{L^1({\mathbb {R}}_\lambda )}+\Vert {R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}\\&\qquad +\Vert {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}. \end{aligned}$$

Then based on Theorem 1.4, the second main result of the paper is the following characterization of \(H^{1}_{\Delta _\lambda }( {\mathbb {R}}_\lambda ) \).

Theorem 1.6

The product Hardy spaces \(H^{1}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) and \(H^1_{Riesz}({\mathbb {R}}_\lambda )\) coincide.

Let \({\mathrm{BMO}}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) be the dual space of \(H^{1}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) introduced in [43, 46]. As a direct consequence of Theorem 1.6, we obtain the following decomposition of \({\mathrm{BMO}}_{\Delta _\lambda }( {\mathbb {R}}_\lambda )\); the proof is similar to that in the classical setting and is omitted.

Theorem 1.7

The following two statements are equivalent.

  (i) \(\varphi \in {\mathrm{BMO}}_{\Delta _\lambda }( {\mathbb {R}}_\lambda ) \);

  (ii) There exist \(g_i\in L^\infty ( {\mathbb {R}}_\lambda )\), \(i=1,2,3,4\), such that

    $$\begin{aligned} \varphi = g_1 + R_{\Delta _\lambda ,\,1}(g_2) + R_{\Delta _\lambda ,\,2}(g_3) + R_{\Delta _\lambda ,\,1}R_{\Delta _\lambda ,\,2}(g_4). \end{aligned}$$

Moreover, we also characterize \(H^{p}_{\Delta _\lambda }( {\mathbb {R}}_\lambda ) \) for \( p\in ((2\lambda +1)/(2\lambda +2), 1)\) via Bessel Riesz transform in a slightly different form. A distribution \(f\in \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\big )^{'}\) is said to be restricted at infinity, if for any \(r>0\) large enough, \(e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}} f\in {L^r({\mathbb {R}}_\lambda )}\) (for the notation and details of this distribution space, we refer to Definition 2.3 below). By Theorem 1.4 and an argument as in [75, pp. 100-101], we see that for any \(f\in H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) with \(p\in ((2\lambda +1)/(2\lambda +2), 1]\), \(e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}} f\in {L^r({\mathbb {R}}_\lambda )}\) for all \(r\in [p, \infty ]\).

Theorem 1.8

Let \(p\in ((2\lambda +1)/(2\lambda +2), 1)\) and \(f\in \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\big )^{'}\) be restricted at infinity. Then \(f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\) if and only if there exists a positive constant C such that for all \(t_1,\,t_2\in (0, \infty )\),

$$\begin{aligned}&\big \Vert e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}(f)\big \Vert _{L^p({\mathbb {R}}_\lambda )}+\big \Vert {R_{\Delta _\lambda ,\,1}}\big (e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}(f)\big )\big \Vert _{L^p({\mathbb {R}}_\lambda )}\nonumber \\&\qquad +\big \Vert {R_{\Delta _\lambda ,\,2}}\big (e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}(f)\big ) \big \Vert _{L^p({\mathbb {R}}_\lambda )}+\big \Vert {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}\big ( e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}(f) \big )\big \Vert _{L^p({\mathbb {R}}_\lambda )}\nonumber \\&\quad \le C. \end{aligned}$$
(1.6)

1.3 Structure and Main Methods of this Paper

In Sect. 2, we first recall the space of test functions and distributions on the product space \({\mathbb {R}}_\lambda \) and the Hardy space \(H^p({\mathbb {R}}_\lambda )\) in [43] in terms of the discrete Littlewood–Paley–Stein square function \(S_d(f)\). We also recall the Poisson kernel and conjugate Poisson kernel in the Bessel setting \(({\mathbb {R}}_+,|\cdot |,{dm_\lambda })\).

In Sect. 3, we provide the \(L^p\)-boundedness (\(1<p<\infty \)) of the product Littlewood–Paley area functions Sf and \(S_uf\) as defined in (1.4) and (1.5), respectively. In fact, we will prove this result for more general Littlewood–Paley area functions and g-functions with the kernels of the operators inside satisfying certain size, smoothness and cancellation conditions which covers both the Bessel Poisson kernel and Bessel heat kernel; see Sect. 3.1. The main approach we use here is Calderón’s reproducing formula, almost orthogonality estimates and the Plancherel–Pólya type inequalities in the product setting. Then we show that \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda ) \) coincides with \(H^p({\mathbb {R}}_\lambda )\); see Theorems 3.1 and 3.4.

In Sect. 4, we present the proof of Theorem 1.4 by showing the following inequalities

$$\begin{aligned} \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}&\le \Vert f\Vert _{{H^p_{S_u}({\mathbb {R}}_\lambda )}} \lesssim \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{P}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{{\mathcal {R}}}_{P}}({\mathbb {R}}_\lambda )} \nonumber \\&\lesssim \Vert f\Vert _{H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}, \end{aligned}$$
(1.7)

i. e., all the norms above are equivalent.

Here the first inequality follows directly by definition. The fourth inequality follows from the well-known subordination formula which connects the Poisson kernel to the heat kernel. The fifth inequality follows directly from definition. The last inequality follows from the result that \({H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\) coincides with the known \(H^p({\mathbb {R}}_\lambda )\) (Theorem 3.5) and the fact that \(H^p({\mathbb {R}}_\lambda )\) has atomic decomposition. The main difficulties here are in the proofs of the second and third inequalities.
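The three inequalities described as direct can be made explicit at the pointwise level. The sketch below writes \(u:=e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}f\), assumes that \(|\nabla _{t_1,\,y_1}\nabla _{t_2,\,y_2}u|^2\) denotes the sum of the squares of the four mixed derivatives (the natural reading of (1.5)), and uses the standard subordination identity, with the interchange of integrals taken for granted for \(f\in L^2({\mathbb {R}}_\lambda )\):

$$\begin{aligned} \big |\sqrt{\triangle _\lambda }e^{-t_1\sqrt{\triangle _\lambda }}\,\sqrt{\triangle _\lambda }e^{-t_2\sqrt{\triangle _\lambda }}f\big |&=|\partial _{t_1}\partial _{t_2}u|\le \big |\nabla _{t_1,\,y_1}\nabla _{t_2,\,y_2}u\big |,\\ e^{-t\sqrt{\triangle _\lambda }}&=\frac{1}{\sqrt{\pi }}\int _0^\infty \frac{e^{-s}}{\sqrt{s}}\,e^{-\frac{t^2}{4s}{\triangle _\lambda }}\,ds,\qquad \frac{1}{\sqrt{\pi }}\int _0^\infty \frac{e^{-s}}{\sqrt{s}}\,ds=1,\\ \big |e^{-t_1^2{\triangle _\lambda }}e^{-t_2^2{\triangle _\lambda }}f(x_1,x_2)\big |&\le \sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|<t_1}{|y_2-x_2|<t_2}}\big |e^{-t_1^2{\triangle _\lambda }}e^{-t_2^2{\triangle _\lambda }}f(y_1,y_2)\big |. \end{aligned}$$

The first line gives \(Sf\le S_uf\) pointwise. Applying the second line in each variable expresses \(e^{-t_1\sqrt{{\triangle _\lambda }}}e^{-t_2\sqrt{{\triangle _\lambda }}}f(x_1,x_2)\) as an average, against a probability density in \((s_1,s_2)\), of the values \(e^{-\tau _1^2{\triangle _\lambda }}e^{-\tau _2^2{\triangle _\lambda }}f(x_1,x_2)\) with \(\tau _i:=t_i/(2\sqrt{s_i})\), so that \({{\mathcal {R}}}_{P}f\le {{\mathcal {R}}}_{h}f\). The third line, obtained by choosing \((y_1,y_2)=(x_1,x_2)\) in the supremum, gives \({{\mathcal {R}}}_{h}f\le {{\mathcal {N}}}_{h}f\).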

To prove the second inequality, we first point out that, to the best of our knowledge, the only known way to pass from the Littlewood–Paley area function to the non-tangential maximal function in the classical product setting is due to Merryfield [67]. The main technique in [67] relies on the construction of a function \(\psi \) in \(C_c^\infty ({\mathbb {R}})\), associated with any given \(\phi \in C_c^\infty ({\mathbb {R}})\) satisfying certain conditions, such that for any \(f\in L^2({{\mathbb {R}}})\),

$$\begin{aligned} \partial _t\left( \frac{1}{t}\phi \left( \frac{\cdot }{t}\right) *f (x)\right) =\partial _x \left( \frac{1}{t}\psi \left( \frac{\cdot }{t}\right) *f(x)\right) , \end{aligned}$$

which is one of the Cauchy–Riemann equations in the classical setting.
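In the classical setting one admissible choice can be written down explicitly. As a sketch (the normalisation and the precise hypotheses on \(\phi \) in [67] may differ), take \(\psi (x):=-x\phi (x)\), which again lies in \(C_c^\infty ({\mathbb {R}})\). With \(\phi _t(x):=t^{-1}\phi (x/t)\) and \(\psi _t(x):=t^{-1}\psi (x/t)\), one computes

$$\begin{aligned} \partial _t\phi _t(x)&=-\frac{1}{t^2}\phi \Big (\frac{x}{t}\Big )-\frac{x}{t^3}\phi '\Big (\frac{x}{t}\Big ),\\ \partial _x\psi _t(x)&=\partial _x\Big (-\frac{x}{t^2}\phi \Big (\frac{x}{t}\Big )\Big )=-\frac{1}{t^2}\phi \Big (\frac{x}{t}\Big )-\frac{x}{t^3}\phi '\Big (\frac{x}{t}\Big ). \end{aligned}$$

Hence \(\partial _t\phi _t=\partial _x\psi _t\), and convolving with f gives the identity displayed above. With this choice the support of \(\psi \) is contained in that of \(\phi \).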

Based on the idea above, given \(\phi \in C^\infty _c({\mathbb {R}}_+)\) with \(\phi \ge 0\), \(\,{\mathrm {supp}\,}(\phi )\subset (0, 1)\), and \({\int _0^\infty }\phi (x)\,{dm_\lambda }(x)=1\), we construct a function \(\psi (t, x, y)\) defined on \({\mathbb {R}}_+\times {\mathbb {R}}_+ \times {\mathbb {R}}_+\) by solving the following equation:

$$\begin{aligned} {\partial }_t (\phi _t{\sharp _\lambda }f)(x)= {\partial }_x[\psi (f)(t,x)]+\frac{2\lambda }{x}\psi (f)(t,x), \end{aligned}$$

where \(\phi _t(x):=t^{-2\lambda -1}\phi (x/t)\) is the dilation of \(\phi \) in the Bessel setting and

$$\begin{aligned} \psi (f)(t,x):=\int _{{\mathbb {R}}_+} \psi (t, x, y) f(y){dm_\lambda }(y). \end{aligned}$$

Moreover, we show that \(\psi (t, x, y)\) satisfies the required size, smoothness and cancellation conditions, and especially the support condition: \(\,{\mathrm {supp}\,}\psi (t, x, y)\subset \{t,x,y\in {\mathbb {R}}_+:\, |x-y|<t\}\). Note that \(\psi (f)\) here is no longer a Hankel convolution (for notation and details, we refer to Lemma 4.1).
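The exponent in the dilation \(\phi _t(x)=t^{-2\lambda -1}\phi (x/t)\) is chosen so that the normalisation of \(\phi \) with respect to \({dm_\lambda }\) is preserved. A one-line check, using the substitution \(x=tu\):

$$\begin{aligned} \int _0^\infty \phi _t(x)\,{dm_\lambda }(x)=\int _0^\infty t^{-2\lambda -1}\phi \Big (\frac{x}{t}\Big )x^{2\lambda }\,dx=\int _0^\infty \phi (u)\,u^{2\lambda }\,du=1,\qquad {\mathrm {supp}\,}(\phi _t)\subset (0,t). \end{aligned}$$

The power \(2\lambda +1\) plays the role of the dimension here, in the same way that \(t^{-n}\phi (x/t)\) preserves the Lebesgue integral on \({\mathbb {R}}^n\).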

To prove the third inequality, we borrow an idea from [81] in the one-parameter setting (see also [37, 38]), to establish a product grand maximal function, which controls the non-tangential maximal function. Then, by using the Calderón reproducing formula and almost orthogonality estimates, we obtain that the \(L^p({\mathbb {R}}_\lambda )\) norm of the grand maximal function is bounded by that of the radial maximal function for \({2\lambda +1\over 2\lambda +2}<p\le 1\).

In Sect. 5, we prove our second main result, the Bessel Riesz transform characterizations of \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) (Theorems 1.6 and 1.8). The main ideas in the one-parameter setting are from Fefferman–Stein [34] (see also [75, Chapter III, Section 4.2] and [5]), where they obtained the result by studying the boundary value of the corresponding harmonic function and its conjugate. In our product setting, we consider the Bessel bi-harmonic function \(u(t_1,t_2,x_1,x_2)\) and its three conjugate bi-harmonic functions v, w, z such that (u, v) and (w, z) satisfy the generalised Cauchy–Riemann equations (1.2) in the first group of variables \((t_1,x_1)\), and (u, w) and (v, z) satisfy them in the second group of variables \((t_2,x_2)\). Then, using [68, Lemma 11], we obtain the harmonic majorants of the following four functions \(\{u^2+v^2\}^{p\over 2}\), \(\{w^2+z^2\}^{p\over 2}\), \(\{u^2+w^2\}^{p\over 2}\), \(\{v^2+z^2\}^{p\over 2}\) corresponding to the four groups of Cauchy–Riemann equations above respectively. By iteration, we obtain the harmonic majorant of the bi-harmonic function \(\{u^2+v^2+w^2+z^2\}^{p\over 2}\). Then, our main result follows from the properties of the Poisson semigroup \(\{e^{-t\sqrt{\Delta _\lambda }}\}_{t>0}\) and the standard approach ([75, Chapter III, Section 4.2]). To the best of our knowledge, the Hilbert transform characterizations have not been addressed before for the classical Chang–Fefferman product Hardy space \(H^p({\mathbb {R}}\times {\mathbb {R}})\) when \(p<1\). We note that when \(\lambda =0\), our result and proof apply to \(H^p({\mathbb {R}}\times {\mathbb {R}})\) with minor modifications, and hence provide the characterizations of \(H^p({\mathbb {R}}\times {\mathbb {R}})\) via Hilbert transforms.

We remark that by using the standard asymptotics for the Bessel function and a recurrence relation for Bessel function, Castro and Szarek [10] obtained a representation of Bessel heat kernel, and showed that many harmonic analysis operators in the Bessel setting are Calderón–Zygmund operators for all \(\lambda \in (-\frac{1}{2}, \infty )\), which extends results in [2] under the restriction \(\lambda \in [0, \infty )\). In this paper, under the assumption \(\lambda >0\), we apply known properties of Hankel convolution in [5, 40], and a basic lemma in [68], concerning sub-harmonicity in the Bessel setting, see (4.4) and Lemma 5.1. It is unknown if the range of \(\lambda \) can be improved. We also point out that the assumption \(p> (2\lambda +1)/(2\lambda +2)\) is necessary, see step 6 in the proof of Theorem 1.4 below.

Throughout the whole paper, we denote by C and \({\widetilde{C}}\) positive constants which are independent of the main parameters, but they may vary from line to line. Constants with subscripts, such as \(C_0\) and \(A_1\), do not change in different occurrences. For every \(p\in (1, \infty )\), we denote by \(p'\) the conjugate of p, i. e., \(\frac{1}{p'}+\frac{1}{p}=1\). If \(f\le Cg\), we then write \(f\lesssim g\) or \(g\gtrsim f\); and if \(f \lesssim g\lesssim f\), we write \(f\sim g.\)

2 Preliminaries

In this section, we first apply the known results on product spaces of homogeneous type developed in [43, 45] to our setting on \({\mathbb {R}}_\lambda \). We then recall the properties of the Poisson kernels and conjugate Poisson kernels in the Bessel setting.

2.1 Product Hardy and BMO Spaces on Spaces of Homogeneous Type

To begin with, we point out that in [43, 45], the authors considered the general setting of product spaces of homogeneous type \({\mathbb {X}}:= (X_1,d_1,\mu _1)\times (X_2,d_2,\mu _2)\), where for \(i=1,2\), \((X_i,d_i,\mu _i)\) is a space of homogeneous type, and developed the test function spaces and distribution spaces, Calderón’s reproducing formula, Littlewood–Paley theory, product Hardy spaces and atomic decompositions. For notational simplicity, we now apply all these results to our setting, i. e.,

$$\begin{aligned} {\mathbb {X}}:={\mathbb {R}}_\lambda :=({\mathbb {R}}_+,|\cdot |,{dm_\lambda })\times ({\mathbb {R}}_+,|\cdot |,{dm_\lambda }). \end{aligned}$$

We first recall the definition of approximation to the identity.

Definition 2.1

We say that a family of operators \(\{S_k\}_{k\in {\mathbb {Z}}}\) on \({\mathbb {R}}_+\) is an approximation to the identity if \(\lim _{k\rightarrow +\infty } S_k=Id\), \(\lim _{k\rightarrow -\infty } S_k=0\) and moreover, the kernel \(S_k(x,y)\) of \(S_k\) satisfies the following condition: for \(\beta ,\gamma \in (0,1]\),

\({\mathrm{(A_i)}}\):

for any \(x,\,y \in {\mathbb {R}}_+\) and \(k \in {\mathbb {Z}}\),

$$\begin{aligned} |S_k(x, y)|&\lesssim \frac{1}{m_\lambda (I(x, 2^{-k}))+m_\lambda (I(y,2^{-k}))+m_\lambda (I(x, |x-y|))}\\&\quad \bigg (\frac{2^{-k}}{|x-y|+2^{-k}}\bigg )^\gamma ; \end{aligned}$$
\({\mathrm{(A_{ii}})}\):

for any \(x,y,\widetilde{y}\in {\mathbb {R}}_+\) and \(k \in {\mathbb {Z}}\) with \(|y-\widetilde{y}|\le (2^{-k}+|x-y|)/2\),

$$\begin{aligned}&|S_k(x, y)-S_k(x,\widetilde{y})|+ |S_k(y,x)-S_k(\widetilde{y},x)|\\&\quad \lesssim \frac{1}{m_\lambda (I(x, 2^{-k}))+m_\lambda (I(y,2^{-k}))+m_\lambda (I(x, |x-y|))}\bigg (\frac{|y-\widetilde{y}|}{|x-y|+2^{-k}}\bigg )^{\beta }\\&\qquad \bigg (\frac{2^{-k}}{|x-y|+2^{-k}}\bigg )^\gamma ; \end{aligned}$$
\({\mathrm{(A_{iii}})}\):

for any \(x,\,y,\,\widetilde{x},\,\widetilde{y}\in {\mathbb {R}}_+\) and \(k\in {\mathbb {Z}}\) with \(|x-\widetilde{x}|\), \(|y-\widetilde{y}|\le (2^{-k}+|x-y|)/3\),

$$\begin{aligned}&\big |[S_k(x, y)-S_k(x,\widetilde{y})]-[S_k(\widetilde{x},y)-S_k(\widetilde{x},\widetilde{y})]\big |\\&\quad \lesssim \frac{1}{m_\lambda (I(x, 2^{-k}))+m_\lambda (I(y,2^{-k}))+m_\lambda (I(x, |x-y|))}\\&\qquad \times \bigg (\frac{|x-\widetilde{x}|}{|x-y|+2^{-k}}\bigg )^{\beta }\bigg (\frac{|y-\widetilde{y}|}{|x-y|+2^{-k}}\bigg )^{\beta }\bigg (\frac{2^{-k}}{|x-y|+2^{-k}}\bigg )^\gamma ; \end{aligned}$$
\({\mathrm{(A_{iv}})}\):

for any \(x\in {\mathbb {R}}_+\) and \(k \in {\mathbb {Z}}\),

$$\begin{aligned} {\int _0^\infty }S_k(x, y){\,dm_\lambda (y)}={\int _0^\infty }S_k(y,x){\,dm_\lambda (y)}=1. \end{aligned}$$

One of the constructions of an approximation to the identity is due to Coifman, see [19]. We set \(D_k:=S_k-S_{k-1}\). It follows directly from the definition that the kernel \(D_k(x,y)\) of \(D_k\) satisfies \({\mathrm{(A_i)}}\), \({\mathrm{(A_{ii}}) }\) and \({\mathrm{(A_{iii}})}\) with \(S_k(x,y)\) replaced by \(D_k(x,y)\), and

$$\begin{aligned} {\int _0^\infty }D_k(x, y){\,dm_\lambda (y)}={\int _0^\infty }D_k(y,x){\,dm_\lambda (y)}=0 \end{aligned}$$

for any \(x\in {\mathbb {R}}_+\) and \(k \in {\mathbb {Z}}\).
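
The cancellation identity is a one-line consequence of \({\mathrm{(A_{iv}})}\). Spelled out for the first integral (the second is identical with the two arguments of the kernels interchanged):

$$\begin{aligned} {\int _0^\infty }D_k(x, y){\,dm_\lambda (y)}={\int _0^\infty }S_k(x, y){\,dm_\lambda (y)}-{\int _0^\infty }S_{k-1}(x, y){\,dm_\lambda (y)}=1-1=0. \end{aligned}$$

The size and smoothness estimates for \(D_k\) follow in the same spirit from the triangle inequality, since the scales \(2^{-k}\) and \(2^{-(k-1)}\) are comparable and \(m_\lambda \) is doubling.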

We now recall the test function spaces and distribution spaces. The one-parameter version was defined by Han, Müller and Yang [47, 48], and the product version by Han, Li and Lu [43].

Definition 2.2

([47]) Consider the space \(({\mathbb {R}}_+,|\cdot |,{dm_\lambda })\). Let \(0<\gamma , \beta \le 1\) and \(r>0.\) A function f defined on \({\mathbb {R}}_+\) is said to be a test function of type \((x_0,r,\beta ,\gamma )\) centered at \(x_0\in {\mathbb {R}}_+\) if f satisfies the following conditions:

  1. (i)

    \(|f(x)|\le C \frac{\displaystyle 1}{\displaystyle m_\lambda (I(x_0,r))+m_\lambda (I(x_0,|x_0-x|))} \Big (\frac{\displaystyle r}{\displaystyle r+|x-x_0|}\Big )^{\gamma }\);

  2. (ii)

    \(|f(x)-f(y)|\le C \Big (\frac{\displaystyle |x-y|}{\displaystyle r+|x-x_0|}\Big )^{\beta } \frac{\displaystyle 1}{\displaystyle m_\lambda (I(x_0,r))+m_\lambda (I(x_0,|x_0-x|))} \Big (\frac{\displaystyle r}{\displaystyle r+|x-x_0|}\Big )^{\gamma }\) for all \(x,y\in {\mathbb {R}}_+\) with \(|x-y|\le {\frac{1}{2}}(r+|x-x_0|).\)

If f is a test function of type \((x_0,r,\beta ,\gamma )\), we write \(f\in {{\mathcal {G}}}(x_0,r,\beta ,\gamma )\) and the norm of \(f\in {{\mathcal {G}}}(x_0,r,\beta ,\gamma )\) is defined by \(\Vert f\Vert _{{{\mathcal {G}}}(x_0,\,r,\,\beta ,\,\gamma )}:=\inf \{C>0: \mathrm{(i)\ and\ (ii)\ hold}\}.\)

Now for any fixed \(x_0\in {\mathbb {R}}_+\), we denote \({{\mathcal {G}}}(\beta ,\gamma ):={{\mathcal {G}}}(x_0,1,\beta ,\gamma )\) and by \({{\mathcal {G}}}_0(\beta ,\gamma )\) the collection of all test functions in \({{\mathcal {G}}}(\beta ,\gamma )\) with \(\int _{{\mathbb {R}}_+} f(x) {dm_\lambda }(x)=0.\) Note that \({{\mathcal {G}}}(x_1,r,\beta ,\gamma )={{\mathcal {G}}}(\beta ,\gamma )\) with equivalent norms for all \(x_1\in {\mathbb {R}}_+\) and \(r>0\) and that \({{\mathcal {G}}}(\beta ,\gamma )\) is a Banach space with respect to the norm in \({{\mathcal {G}}}(\beta ,\gamma )\).

Let \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\) be the completion of the space \({{\mathcal {G}}}_0(1,1)\) in the norm of \({{\mathcal {G}}}(\beta ,\gamma )\) when \(0<\beta ,\gamma <1\). If \(f\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\), we then define \(\Vert f\Vert _{{\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )}:=\Vert f\Vert _{{{\mathcal {G}}}(\beta ,\gamma )}\). \(({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma ))'\), the distribution space, is defined to be the set of all linear functionals L from \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\) to \({\mathbb {C}}\) with the property that there exists \(C\ge 0\) such that for all \(f\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\), \(|L(f)|\le C\Vert f\Vert _{{\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )}.\)

Now we return to the product setting and recall the space of test functions and distributions on the product space \({\mathbb {R}}_\lambda \).

Definition 2.3

([43]) Let \((x_0,y_0)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\), \(0<\gamma _1,\gamma _2,\beta _1,\beta _2\le 1\) and \(r_1, r_2>0.\) A function \(f(x,y)\) defined on \({\mathbb {R}}_\lambda \) is said to be a test function of type \((x_0,y_0;r_1,r_2;\beta _1,\beta _2;\gamma _1,\gamma _2)\) if for any fixed \(y\in {\mathbb {R}}_+,\) \(f(x,y)\), as a function of the variable x, is a test function in \({{\mathcal {G}}}(x_0,r_1,\beta _1,\gamma _1)\) on \({\mathbb {R}}_+.\) Moreover, the following conditions are satisfied:

  1. (i)

    \(\Vert f(\cdot ,y)\Vert _{{{\mathcal {G}}}(x_0,\,r_1,\,\beta _1,\,\gamma _1)}\le \frac{\displaystyle C}{\displaystyle m_\lambda (I(y_0,r_2))+m_\lambda (I(y_0,|y_0-y|))} \Big (\frac{\displaystyle r_2}{\displaystyle r_2+|y_0-y|}\Big )^{\gamma _2};\)

  2. (ii)

    \(\Vert f(\cdot ,y)-f(\cdot ,\widetilde{y})\Vert _{{{\mathcal {G}}}(x_0,\,r_1,\,\beta _1,\,\gamma _1)}\)

    $$\begin{aligned} \le \frac{\displaystyle C}{\displaystyle m_\lambda (I(y_0,r_2))+m_\lambda (I(y_0,|y_0-y|))} \Big (\frac{\displaystyle r_2}{\displaystyle r_2+|y_0-y|}\Big )^{\gamma _2} \Big (\frac{\displaystyle |y-\widetilde{y}|}{\displaystyle r_2+|y-y_0|}\Big )^{\beta _2} \end{aligned}$$

    for all \(y,\widetilde{y}\in {\mathbb {R}}_+\) with \(|y-\widetilde{y}|\le (r_2+|y-y_0|)/2\).

Similarly, for any fixed \(x\in {\mathbb {R}}_+,\) \(f(x,y)\), as a function of the variable y, is a test function in \({{\mathcal {G}}}(y_0,r_2,\beta _2,\gamma _2)\) on \({\mathbb {R}}_+\), and both properties (i) and (ii) also hold with x and y interchanged.

If f is a test function of type \((x_0,y_0;r_1,r_2;\beta _1,\beta _2;\gamma _1,\gamma _2)\), we write \(f\in {{\mathcal {G}}}(x_{0},y_{0};r_{1},r_{2};\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) and the norm of f is defined by

$$\begin{aligned} \Vert f\Vert _{{{\mathcal {G}}}(x_{0},\,y_{0};\,r_{1},\,r_{2};\,\beta _{1},\,\beta _{2}; \,\gamma _{1},\,\gamma _{2})}&:=\inf \{C:\ C\, {{{\mathrm{satisfies}}}}\, {\mathrm{(i)\, and\ (ii)\ for}}\, f(\cdot , y)\in {{\mathcal {G}}}(x_0,r_1,\beta _1,\gamma _1)\,\\&\quad {{\mathrm{and}}}\, f(x, \cdot )\in {{\mathcal {G}}}(y_0,r_2,\beta _2,\gamma _2)\}. \end{aligned}$$

We denote by \({{\mathcal {G}}}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) the class \({{\mathcal {G}}}(x_{0},y_{0};1,1;\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) for any fixed \((x_{0},y_{0})\in {\mathbb {R}}_+\times {\mathbb {R}}_+.\) Say that \(f(x,y)\in {{\mathcal {G}}}_0(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) if

$$\begin{aligned} \int _{{\mathbb {R}}_+}f(x,y)dm_\lambda (x)=\int _{{\mathbb {R}}_+}f(x,y)dm_\lambda (y)=0. \end{aligned}$$

Note that \({{\mathcal {G}}}(x_{0},y_{0};r_{1},r_{2};\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})= {{\mathcal {G}}}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) with equivalent norms for all \((x_{0},y_{0})\in {\mathbb {R}}_+\times {\mathbb {R}}_+\) and \(r_1,r_2>0\) and that \({{\mathcal {G}}}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) is a Banach space with respect to the norm in \({{\mathcal {G}}}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\).

Let \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\) be the completion of the space \({{\mathcal {G}}}_0(1,1;1,1)\) in \({{\mathcal {G}}}(\beta _1,\beta _2;\gamma _1,\gamma _2)\) with \(0<\beta _i,\gamma _i<1\). If \(f\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2}) \), we then define \(\Vert f\Vert _{{\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\,\beta _{2};\,\gamma _{1},\,\gamma _{2})}:=\Vert f\Vert _{{{\mathcal {G}}}(\beta _{1},\,\beta _{2};\,\gamma _{1},\,\gamma _{2})}\).

We define the distribution space \(\big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\big )^{'}\) as the set of all linear functionals L from the space \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\) to \({\mathbb {C}}\) with the property that there exists \(C\ge 0\) such that for all \(f\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\beta _{2};\gamma _{1},\gamma _{2})\), \(|L(f)|\le C \Vert f\Vert _{{\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _{1},\,\beta _{2};\,\gamma _{1},\,\gamma _{2})}.\)

We now recall the Hardy space \(H^p({\mathbb {R}}_\lambda )\) in [43] in terms of the discrete Littlewood–Paley–Stein square function \(S_d(f)\) via a system of “dyadic cubes” in spaces of homogeneous type. We mention that in our current setting, we take the classical dyadic intervals as our dyadic system. That is, for each \(k\in {\mathbb {Z}}\),

$$\begin{aligned} {\mathscr {X}}^k:=\{\tau :\,\,I^k_\tau :=(\tau 2^k, (\tau +1)2^k]\}_{\tau \in {\mathbb {Z}}_+}, \end{aligned}$$

where \({\mathbb {Z}}_+:={\mathbb {N}}\cup \{0\}\).
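
For concreteness, the measure of a dyadic interval is explicit. The computation below is a direct evaluation under the convention \(dm_\lambda (y)=y^{2\lambda }\,dy\), applied to the intervals \(I^k_\tau \) exactly as displayed above:

$$\begin{aligned} m_\lambda (I^k_\tau )=\int _{\tau 2^k}^{(\tau +1)2^k}y^{2\lambda }\,dy=\frac{2^{k(2\lambda +1)}}{2\lambda +1}\Big [(\tau +1)^{2\lambda +1}-\tau ^{2\lambda +1}\Big ]. \end{aligned}$$

Moreover, for each fixed k the intervals \(I^k_\tau \), \(\tau \in {\mathbb {Z}}_+\), are pairwise disjoint and their union is \((0,\infty )\).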

Definition 2.4

([43]) For \(i = 1\), 2, let \(\{S^{(i)}_{k_i}\}_{k_i\in {\mathbb {Z}}}\) be an approximation to the identity on \({\mathbb {R}}_+\), and let \(D^{(i)}_{k_i} := S^{(i)}_{k_i}-S^{(i)}_{k_i-1}\). For \(f\in \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\big )'\) with \(0< \beta _i, \gamma _i < \theta _i\) for \(i = 1\), 2, the discrete Littlewood–Paley–Stein square function \( S_d(f)\) of f is defined by

$$\begin{aligned} S_d(f)(x,y) :=\bigg \{ \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} \big |D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}})\big |^2 \chi _{I_{\alpha _1}^{k_1}}(x)\chi _{I_{\alpha _2}^{k_2}}(y)\bigg \}^{1\over 2}, \end{aligned}$$

where \(x_{I_{\alpha _i}^{k_i}}\) is the center of the dyadic interval \(I_{\alpha _i}^{k_i}\) for \(i=1,2\), and \(N_1\) and \(N_2\) are two large fixed positive numbers.

By [43, Theorem 2.14 and Proposition 2.16], we then see that for \(1<p<\infty \) and \(f\in L^p({\mathbb {R}}_\lambda )\),

$$\begin{aligned} \big \Vert S_d(f) \big \Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{L^p({\mathbb {R}}_\lambda )}. \end{aligned}$$
(2.1)

Moreover, one can define the Hardy space in terms of \(S_d(f)\) as follows.

Definition 2.5

([43]) Suppose \(\frac{ 2\lambda +1}{ 2\lambda +2} <p\le 1\) and \(0< \beta _i, \gamma _i <1\) for \(i = 1\), 2. The Hardy space \(H^p({\mathbb {R}}_\lambda )\) is defined by

$$\begin{aligned} H^p({\mathbb {R}}_\lambda ) :=\big \lbrace f \in \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\big )': S_d(f)\in L^p({\mathbb {R}}_\lambda )\big \rbrace , \end{aligned}$$

with the norm (quasi-norm) \(\Vert f\Vert _{H^p({\mathbb {R}}_\lambda )} := \Vert S_d(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\).
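
To get a feel for the admissible range of p, it may help to write \(n:=2\lambda +1\) (a shorthand used only in this remark), so that the lower bound reads \(n/(n+1)\). A few sample values, computed directly from the formula:

$$\begin{aligned} \frac{2\lambda +1}{2\lambda +2}=\frac{n}{n+1};\qquad \lambda =\tfrac{1}{2}:\ \tfrac{2}{3},\qquad \lambda =1:\ \tfrac{3}{4},\qquad \lim _{\lambda \rightarrow 0^+}\frac{2\lambda +1}{2\lambda +2}=\tfrac{1}{2},\qquad \lim _{\lambda \rightarrow \infty }\frac{2\lambda +1}{2\lambda +2}=1. \end{aligned}$$

Thus the admissible range of p shrinks as \(\lambda \) grows. Reading n as a dimension-type exponent is only a heuristic suggested by the growth \(r^{2\lambda +1}\) of \(m_\lambda (I(0,r))\); it is not a statement taken from [43].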

We recall the atomic decomposition for \(H^p({\mathbb {R}}_\lambda )\) in [45]. We call \(R:=I_{\tau _1}^{k_1}\times I_{\tau _2}^{k_2}\) a dyadic rectangle in \({\mathbb {R}}_+\times {\mathbb {R}}_+\), where for \(i=1,2\), \(I_{\tau _i}^{k_i}\) is a dyadic interval. Let \(\Omega \subset {\mathbb {R}}_+\times {\mathbb {R}}_+\) be an open set of finite measure and \(m_i(\Omega )\) denote the family of dyadic rectangles \(R\subset \Omega \) which are maximal in the ith “direction”, \(i=1,2\). Also we denote by \(m(\Omega )\) the set of all maximal dyadic rectangles contained in \(\Omega \).

Definition 2.6

([45]) Suppose \(\frac{ 2\lambda +1}{ 2\lambda +2} <p\le 1\). A function \(a(x_1,x_2)\) defined on \( {\mathbb {R}}_+\times {\mathbb {R}}_+ \) is called an atom of \(H^p( {\mathbb {R}}_\lambda )\) if \(a(x_1,x_2)\) satisfies:

  1. (1)

    supp \(a\subset \Omega \), where \(\Omega \) is an open set of \( {\mathbb {R}}_+\times {\mathbb {R}}_+ \) with finite measure;

  2. (2)

    \(\Vert a\Vert _{L^2({\mathbb {R}}_\lambda )}\le \mu _\lambda (\Omega )^{1/2-1/p}\);

  3. (3)

    a can be further decomposed into rectangular atoms \(a_R\) associated with dyadic rectangles \(R:=I_1\times I_2\), satisfying the following:

    1. (i)

      there exist two constants \({{\bar{C}}}_1\) and \({{\bar{C}}}_2\) such that supp \(a_R\subset {{\bar{C}}}_1I_1\times {{\bar{C}}}_2I_2\);

    2. (ii)

      \(\int _{{\mathbb {R}}_+}a_R(x_1,x_2){dm_\lambda }(x_1)=0\) for a.e. \(x_2\in {\mathbb {R}}_+\) and \(\int _{{\mathbb {R}}_+}a_R(x_1,x_2){dm_\lambda }(x_2)=0\) for a.e. \(x_1\in {\mathbb {R}}_+\);

    3. (iii)

      \(a=\sum \limits _{R\in m(\Omega )}a_R\) and \( \Big (\sum \limits _{R\in m(\Omega )}\Vert a_R\Vert _{L^2({\mathbb {R}}_\lambda )}^2\Big )^{1/2} \le \mu _\lambda (\Omega )^{1/2-1/p}\).
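
Conditions (1) and (2) normalise atoms in \(L^p({\mathbb {R}}_\lambda )\). A short check, assuming that \(\mu _\lambda \) denotes the product measure \(dm_\lambda \times dm_\lambda \) and using Hölder's inequality with exponents \(2/p\) and \(2/(2-p)\):

$$\begin{aligned} \Vert a\Vert _{L^p({\mathbb {R}}_\lambda )}^p=\int _\Omega |a|^p\,d\mu _\lambda \le \Big (\int _\Omega |a|^2\,d\mu _\lambda \Big )^{p/2}\mu _\lambda (\Omega )^{1-p/2}\le \mu _\lambda (\Omega )^{p/2-1}\,\mu _\lambda (\Omega )^{1-p/2}=1. \end{aligned}$$

This is why the exponent \(1/2-1/p\) appears in (2) and (iii): it is exactly the power that compensates the size of the support.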

Proposition 2.7

([45]) Suppose \(\frac{ 2\lambda +1}{ 2\lambda +2} <p\le 1\). Then \(f\in L^2( {\mathbb {R}}_\lambda )\cap H^p( {\mathbb {R}}_\lambda )\) if and only if f has an atomic decomposition; that is,

$$\begin{aligned} f=\sum _{i=-\infty }^\infty \lambda _ia_i, \end{aligned}$$

in the sense of both \(H^p({\mathbb {R}}_\lambda )\) and \(L^2( {\mathbb {R}}_\lambda )\), where \(a_i\) are atoms and \(\sum _i|\lambda _i|^p<\infty .\) Moreover,

$$\begin{aligned} \Vert f\Vert _{H^p( {\mathbb {R}}_\lambda )} \sim \inf \left\{ \sum _{i=-\infty }^{\infty }|\lambda _i |^p \right\} ^{\frac{1}{p}}, \end{aligned}$$

where the infimum is taken over all decompositions as above and the implicit constants are independent of the \(L^2( {\mathbb {R}}_\lambda )\) and \(H^p( {\mathbb {R}}_\lambda )\) norms of f.

2.2 Poisson Kernel and Conjugate Poisson Kernel in the Bessel Setting \(({\mathbb {R}}_+,|\cdot |,{dm_\lambda })\)

Recall that \({P^{[\lambda ]}_t}(f):= e^{-t\sqrt{\triangle _\lambda }}f=P^{[\lambda ]}_t\sharp _\lambda f\) and \(W^{[\lambda ]}_t(f):= e^{-t{\triangle _\lambda }}f= W^{[\lambda ]}_{\sqrt{2t}}\sharp _\lambda f\), where

$$\begin{aligned} P^{[\lambda ]}(x):=\frac{2\lambda {\Gamma }(\lambda )}{{\Gamma }(\lambda +1/2)\sqrt{\pi }} \frac{1}{(1+x^2)^{\lambda +1}} \end{aligned}$$

and \(W^{[\lambda ]}(x):=2^{(1-2\lambda )/2} \exp \left(-x^2/2\right)/{\Gamma }(\lambda +1/2)\) and \(f\in {L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\). For f and \(\varphi \in {L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\), their Hankel convolution is defined by setting, for all \(x,\,t\in (0,\infty )\),

$$\begin{aligned} \Phi _{t,\,\lambda }f(x):=\varphi _t\sharp _\lambda f(x):=\displaystyle \int _0^\infty f(y){\tau ^{[\lambda ]}_x}\varphi _t(y){dm_\lambda }(y), \end{aligned}$$

where for \(t,\,x\in (0, \infty )\), \(\varphi _t(y):=t^{-2\lambda -1}\varphi (y/t)\) and \({\tau ^{[\lambda ]}_x}\varphi _t(y)\) denotes the Hankel translation of \(\varphi _t(y)\), that is,

$$\begin{aligned} {\tau ^{[\lambda ]}_x}\varphi _t(y):=c_\lambda \displaystyle \int _0^\pi \varphi _t\left(\sqrt{x^2+y^2-2xy\cos \theta }\right)(\sin \theta )^{2\lambda -1}\,d\theta \end{aligned}$$
(2.2)

with \(c_\lambda :=\frac{{\Gamma }(\lambda +1/2)}{{\Gamma }(\lambda )\sqrt{\pi }}\), see [5, pp. 200-201] or [40].
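
The constant \(c_\lambda \) is exactly what makes the Hankel translation fix constant functions. A brief verification, using the standard Beta-function identity \(\int _0^\pi (\sin \theta )^{2\lambda -1}\,d\theta =B(\lambda ,1/2)\):

$$\begin{aligned} c_\lambda \int _0^\pi (\sin \theta )^{2\lambda -1}\,d\theta =\frac{{\Gamma }(\lambda +1/2)}{{\Gamma }(\lambda )\sqrt{\pi }}\cdot \frac{{\Gamma }(\lambda ){\Gamma }(1/2)}{{\Gamma }(\lambda +1/2)}=\frac{{\Gamma }(1/2)}{\sqrt{\pi }}=1. \end{aligned}$$

For instance, when \(\lambda =1\) one has \(c_1={\Gamma }(3/2)/({\Gamma }(1)\sqrt{\pi })=1/2\) and \(\int _0^\pi \sin \theta \,d\theta =2\), whose product is 1.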

Moreover, we recall that \(\{e^{-t\Delta _\lambda }\}_{t>0}\) or \(\{e^{-t\sqrt{\Delta _\lambda }}\}_{t>0}\) have the following properties; see [5, 79, 80].

Lemma 2.8

Let \(\{T_t\}_{t>0}\) be one of \(\{e^{-t\Delta _\lambda }\}_{t>0}\) or \(\{e^{-t\sqrt{\Delta _\lambda }}\}_{t>0}\). Then \(\{T_t\}_{t>0}\) is a symmetric diffusion semigroup satisfying that \(T_tT_s=T_sT_t\) for any \(t,\,s\in (0, \infty )\), \(\lim _{t\rightarrow 0}T_tf=f\) in \({L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) and

\({\mathrm{(S_i)}}\):

\(\Vert T_tf\Vert _{{L^p({{\mathbb {R}}}_+,\, dm_\lambda )}}\le \Vert f\Vert _{{L^p({{\mathbb {R}}}_+,\, dm_\lambda )}}\) for all \(p\in [1, \infty ]\) and \(t\in (0,\,\infty )\);

\({\mathrm{(S_{ii}})}\):

\(T_t f\ge 0\) for all \(f\ge 0\) and \(t\in (0, \infty )\);

\({\mathrm{(S_{iii}})}\):

\(T_t(1)=1\) for all \(t\in (0,\,\infty )\).

Next we recall the definitions of the Poisson kernel and conjugate Poisson kernel. For any \(t,\,x,\,y\in (0, \infty )\),

$$\begin{aligned} P^{[\lambda ]}_tf(x):={\int _0^\infty }P^{[\lambda ]}_t(x,y)f(y)y^{2\lambda }\,dy, \end{aligned}$$

where

$$\begin{aligned} P^{[\lambda ]}_t(x,y)=\frac{2\lambda t}{\pi }\int _0^\pi \frac{(\sin \theta )^{2\lambda -1}}{(x^2+y^2 +t^2-2xy\cos \theta )^{\lambda +1}}\,d\theta . \end{aligned}$$

See [5, 78].
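
The integral formula for \(P^{[\lambda ]}_t(x,y)\) can be recovered from the profile \(P^{[\lambda ]}\) and the Hankel translation (2.2). The following is a sketch of that computation, assuming that \(P^{[\lambda ]}_t(x,y)={\tau ^{[\lambda ]}_x}P^{[\lambda ]}_t(y)\) with \(P^{[\lambda ]}_t(z):=t^{-2\lambda -1}P^{[\lambda ]}(z/t)\):

$$\begin{aligned} P^{[\lambda ]}_t(z)&=\frac{2\lambda {\Gamma }(\lambda )}{{\Gamma }(\lambda +1/2)\sqrt{\pi }}\,\frac{t^{-2\lambda -1}}{(1+z^2/t^2)^{\lambda +1}}=\frac{2\lambda {\Gamma }(\lambda )}{{\Gamma }(\lambda +1/2)\sqrt{\pi }}\,\frac{t}{(t^2+z^2)^{\lambda +1}},\\ {\tau ^{[\lambda ]}_x}P^{[\lambda ]}_t(y)&=c_\lambda \int _0^\pi P^{[\lambda ]}_t\left(\sqrt{x^2+y^2-2xy\cos \theta }\right)(\sin \theta )^{2\lambda -1}\,d\theta \\&=\frac{2\lambda t}{\pi }\int _0^\pi \frac{(\sin \theta )^{2\lambda -1}}{(x^2+y^2 +t^2-2xy\cos \theta )^{\lambda +1}}\,d\theta , \end{aligned}$$

where the last step uses \(c_\lambda \cdot \frac{2\lambda {\Gamma }(\lambda )}{{\Gamma }(\lambda +1/2)\sqrt{\pi }}=\frac{2\lambda }{\pi }\).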

If \(f\in {L^p({{\mathbb {R}}}_+,\, dm_\lambda )}\), \(p\in [1, \infty )\), the \(\Delta _\lambda \)-conjugate of f is defined by setting, for any \(t,\,x,\,y\in (0, \infty )\),

$$\begin{aligned} {Q^{[\lambda ]}_t}(f)(x):=\int _0^\infty {Q^{[\lambda ]}_t}(x, y)f(y)\, dm_\lambda (y), \end{aligned}$$
(2.3)

where

$$\begin{aligned} {Q^{[\lambda ]}_t}(x, y):=-\displaystyle \frac{2\lambda }{\pi }\displaystyle \int _0^\pi \displaystyle \frac{(x-y\cos \theta )(\sin \theta )^{2\lambda -1}}{(x^2+y^2+t^2-2xy\cos \theta )^{\lambda +1}}\,d\theta ; \end{aligned}$$

see [68, p. 84]. We point out that there exists the boundary value function \(\lim _{t\rightarrow 0}{Q^{[\lambda ]}_t}(f)(x)\) for almost every \(x\in (0, \infty )\) (see [68, p. 84]), which is defined to be the Riesz transform \({R_{\Delta _\lambda }}(f)\), i.e.,

$$\begin{aligned} {R_{\Delta _\lambda }}(f)(x)&:=\lim _{t\rightarrow 0}{Q^{[\lambda ]}_t}(f)(x) \nonumber \\&= \int _{{\mathbb {R}}_+} -\displaystyle \frac{2\lambda }{\pi }\displaystyle \int _0^\pi \displaystyle \frac{(x-y\cos \theta )(\sin \theta )^{2\lambda -1}}{(x^2+y^2-2xy\cos \theta )^{\lambda +1}}\,d\theta \ f(y) {dm_\lambda }(y). \end{aligned}$$
(2.4)

Moreover, we note that \(u(t,x):={P^{[\lambda ]}_t}(f)(x)\) satisfies (1.1) and that \(u(t,x):={P^{[\lambda ]}_t}(f)(x)\) and \(v(t,x):={Q^{[\lambda ]}_t}(f)(x)\) satisfy the Cauchy–Riemann equations (1.2).

Proposition 2.9

([79, 80]) For any fixed t and \(x\in {\mathbb {R}}_+\), \( {P^{[\lambda ]}_t}(x, \cdot )\), \( {Q^{[\lambda ]}_t}(x, \cdot )\), \( t\partial _tP_t^{[\lambda ]}(x, \cdot )\) and \( t\partial _y{P^{[\lambda ]}_t}(x, \cdot )\) as functions of x are in \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\) for all \(\beta ,\gamma \in (0,1]\); symmetrically, for any fixed t and \(y\in {\mathbb {R}}_+\), \(t\partial _tP_t^{[\lambda ]}(\cdot , y)\) and \( t\partial _y{P^{[\lambda ]}_t}(\cdot , y)\) are in \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta ,\gamma )\) for all \(\beta ,\gamma \in (0,1]\).

Based on Definition 2.3 and Proposition 2.9, we further point out the following result.

Proposition 2.10

For any fixed \(t_1\), \(t_2\), \(x_1\) and \(x_2\in {\mathbb {R}}_+\),

$$\begin{aligned}&{P^{[\lambda ]}_{t_1}}(x_1, \cdot ){P^{[\lambda ]}_{t_2}}(x_2,\cdot ),\, {P^{[\lambda ]}_{t_1}}(x_1, \cdot ){Q^{[\lambda ]}_{t_2}}(x_2,\cdot ),\,\\&{Q^{[\lambda ]}_{t_1}}(x_1, \cdot ){P^{[\lambda ]}_{t_2}}(x_2,\cdot ),\,{Q^{[\lambda ]}_{t_1}}(x_1, \cdot ){Q^{[\lambda ]}_{t_2}}(x_2,\cdot ) \end{aligned}$$

as functions of \((x_1,x_2)\) are in \({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\) for all \(\beta _1,\gamma _1,\beta _2,\gamma _2\in (0,1]\).

3 Characterisation of Hardy Space \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) via \(S_d(f)\)

In this section, we obtain the \(L^p({\mathbb {R}}_\lambda )\)-boundedness of a general version of product Littlewood–Paley area functions and square functions, and then we show that the product Hardy space \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) coincides with the known Hardy space \(H^{p}({\mathbb {R}}_\lambda )\).

3.1 \(L^p({\mathbb {R}}_\lambda )\)-Boundedness of the Product Littlewood–Paley Area Functions and Square Functions

In this subsection, we provide the \(L^p({\mathbb {R}}_\lambda )\)-boundedness of a general version of the product Littlewood–Paley area functions and square functions for \(1<p<\infty \), which covers S(f) and \(S_u(f)\) defined in (1.4) and (1.5), respectively. Before doing this, we first consider a family of integral operators \(\{Q_t\}_{t>0}\) acting on functions on \({\mathbb {R}}_+\). We assume that for each \(t>0\), the kernel \(Q_t(x,y)\) of \(Q_t\) satisfies the following three conditions:

\({\mathrm{(K_i)}}\):

for any \(x,\,y,\,t \in {\mathbb {R}}_+\),

$$\begin{aligned} |Q_t(x, y)|\lesssim \frac{1}{m_\lambda (I(x, t))+m_\lambda (I(y,t))+m_\lambda (I(x, |x-y|))}\frac{t}{|x-y|+t}; \end{aligned}$$
\({\mathrm{(K_{ii}})}\):

for any \(x,y,\widetilde{y},t \in {\mathbb {R}}_+\) with \(|y-\widetilde{y}|\le (t+|x-y|)/2\),

$$\begin{aligned}&|Q_t(x, y)-Q_t(x,\widetilde{y})|\\&\quad \lesssim \frac{1}{m_\lambda (I(x, t))+m_\lambda (I(y,t))+m_\lambda (I(x, |x-y|))}\frac{t|y-\widetilde{y}|}{(|x-y|+t)^2}; \end{aligned}$$
\({\mathrm{(K_{iii}})}\):

for any \(t,\, x\in {\mathbb {R}}_+\),

$$\begin{aligned} Q_t(1)(x):={\int _0^\infty }Q_t(x, y){\,dm_\lambda (y)}=0. \end{aligned}$$
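
Conditions \({\mathrm{(K_i)}}\) and \({\mathrm{(K_{ii}})}\) can be read as continuous-parameter analogues of \({\mathrm{(A_i)}}\) and \({\mathrm{(A_{ii}})}\) in Definition 2.1, with \(2^{-k}\) replaced by t and \(\beta =\gamma =1\), except that \({\mathrm{(K_{ii}})}\) asks for regularity in the second variable only. This reading is offered as orientation rather than as a statement from the references. With these choices the two decay factors combine as

$$\begin{aligned} \bigg (\frac{|y-\widetilde{y}|}{|x-y|+t}\bigg )^{1}\bigg (\frac{t}{|x-y|+t}\bigg )^{1}=\frac{t|y-\widetilde{y}|}{(|x-y|+t)^2}, \end{aligned}$$

which is precisely the factor in \({\mathrm{(K_{ii}})}\), while \({\mathrm{(K_{iii}})}\) plays the role of the cancellation condition satisfied by \(D_k\).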

Now we consider a version of the product Littlewood–Paley g-function \({\mathbb {G}}(f)\) defined by setting, for any \(f\in L^p({\mathbb {R}}_\lambda )\) with \(p\in (1, \infty )\) and \((x_1, x_2)\in {\mathbb {R}}_\lambda \),

$$\begin{aligned} {\mathbb {G}}(f)(x_1, x_2) :=\left\{ \int _0^\infty \int _0^\infty \Big |Q^{(1)}_{t_1}Q^{(2)}_{t_2}(f)(x_1,x_2)\Big |^2 {dt_1\over t_1}{dt_2\over t_2} \right\} ^{1\over 2}, \end{aligned}$$

where for \(i = 1, 2\), \(\{Q^{(i)}_{t_i}\}_{t_i>0}\) is a family of integral operators on \({\mathbb {R}}_+\) satisfying that for each \(t_i\), the kernel \(Q^{(i)}_{t_i}(x_i, y_i)\) of \(Q^{(i)}_{t_i}\) satisfies properties \({\mathrm{(K_i)}}\)-\({\mathrm{(K_{iii}})}\) as above. Then we have the following \(L^p({\mathbb {R}}_\lambda )\)-boundedness of \({\mathbb {G}}(f)\).

Theorem 3.1

Let \(p\in (1, \infty )\), \(Q^{(1)}_{t_1}\) and \(Q^{(2)}_{t_2}\) be the same as above. Then for every \(f\in L^p({\mathbb {R}}_\lambda )\), and almost all \(x_2\), we have that

$$\begin{aligned} \bigg \Vert \left\{ \int _0^\infty \Big |Q^{(1)}_{t_1}(f(\cdot ,x_2))(x_1)\Big |^2 {dt_1\over t_1} \right\} ^{1/2}\bigg \Vert _{L^p({\mathbb {R}}_+,\,{dm_\lambda }(x_1))}\lesssim \Vert f(\cdot ,x_2)\Vert _{L^p({\mathbb {R}}_+,\,{dm_\lambda }(x_1))},\nonumber \\ \end{aligned}$$
(3.1)

and a similar result holds for \(Q^{(2)}_{t_2}\), and that

$$\begin{aligned} \Vert {\mathbb {G}}(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{L^p({\mathbb {R}}_\lambda )}. \end{aligned}$$
(3.2)

Proof

We first prove (3.2). To this end, it suffices to prove that

$$\begin{aligned} \Vert {\mathbb {G}}(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \big \Vert S_d(f) \big \Vert _{L^p({\mathbb {R}}_\lambda )}, \end{aligned}$$
(3.3)

where \(S_d\) is defined as in Definition 2.4. To see this, we now recall Calderón’s reproducing formula ([43, Theorem 2.9])

$$\begin{aligned} f(x_1,x_2)&= \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} m_\lambda (I_{\alpha _1}^{k_1})m_\lambda (I_{\alpha _2}^{k_2}) \nonumber \\&\quad \times {\tilde{D}}^{(1)}_{k_1}(x_1,x_{I_{\alpha _1}^{k_1}}) {\tilde{D}}^{(2)}_{k_2}(x_2,x_{I_{\alpha _2}^{k_2}})D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}}), \end{aligned}$$
(3.4)

where the series converges in the sense of \(L^p({\mathbb {R}}_\lambda )\) and, for \(i=1,2\), \({\tilde{D}}^{(i)}_{k_i}\) satisfies the same size, smoothness and cancellation conditions as \(D^{(i)}_{k_i}\) does.

Observe that \(Q^{(1)}_{t_1}Q^{(2)}_{t_2}\) is bounded on \({L^p({\mathbb {R}}_\lambda )}\). Applying Calderón’s reproducing formula to the left-hand side of (3.3), we obtain that

$$\begin{aligned} {\mathbb {G}}(f)(x_1,x_2)^2&=\sum _{\widetilde{k}_1\in {\mathbb {Z}}}\sum _{\widetilde{k}_2\in {\mathbb {Z}}} \int _{2^{-\widetilde{k}_1-1}}^{2^{-\widetilde{k}_1}}\int _{2^{-\widetilde{k}_2-1}}^{2^{-\widetilde{k}_2}} \bigg | \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} m_\lambda (I_{\alpha _1}^{k_1})m_\lambda (I_{\alpha _2}^{k_2}) \nonumber \\&\quad \times Q^{(1)}_{t_1}({\tilde{D}}^{(1)}_{k_1}(\cdot ,x_{I_{\alpha _1}^{k_1}}) )(x_1)Q^{(2)}_{t_2}({\tilde{D}}^{(2)}_{k_2}(\cdot ,x_{I_{\alpha _2}^{k_2}}))(x_2)\nonumber \\&\quad \times D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}})\bigg |^2 {dt_1\over t_1}{dt_2\over t_2}. \end{aligned}$$
(3.5)
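
The passage to (3.5) uses only the fact that the intervals \((2^{-\widetilde{k}-1},2^{-\widetilde{k}}]\), \(\widetilde{k}\in {\mathbb {Z}}\), partition \((0,\infty )\). In one parameter, for a nonnegative measurable function F (a placeholder introduced here for illustration),

$$\begin{aligned} \int _0^\infty F(t)\,{dt\over t}=\sum _{\widetilde{k}\in {\mathbb {Z}}}\int _{2^{-\widetilde{k}-1}}^{2^{-\widetilde{k}}}F(t)\,{dt\over t},\qquad \int _{2^{-\widetilde{k}-1}}^{2^{-\widetilde{k}}}{dt\over t}=\ln 2. \end{aligned}$$

Thus, once the integrand is bounded uniformly in t on a dyadic block, that block contributes at most \(\ln 2\) times the bound, which is how the integrals in \(t_1\) and \(t_2\) are absorbed into the implicit constants.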

Note that for \(t_1\in (2^{-\widetilde{k}_1-1}, 2^{-\widetilde{k}_1}]\) and \(t_2\in (2^{-\widetilde{k}_2-1}, 2^{-\widetilde{k}_2}]\),

$$\begin{aligned} Q^{(1)}_{t_1}({\tilde{D}}^{(1)}_{k_1}(\cdot ,x_{I_{\alpha _1}^{k_1}}) )(x_1)Q^{(2)}_{t_2}({\tilde{D}}^{(2)}_{k_2}(\cdot ,x_{I_{\alpha _2}^{k_2}}))(x_2) \end{aligned}$$

satisfies the following almost orthogonality estimate: for \(\epsilon \in (0, 1)\),

$$\begin{aligned}&\big |Q^{(1)}_{t_1}({\tilde{D}}^{(1)}_{k_1}(\cdot ,x_{I_{\alpha _1}^{k_1}}))(x_1) Q^{(2)}_{t_2}({\tilde{D}}^{(2)}_{k_2}(\cdot ,x_{I_{\alpha _2}^{k_2}}))(x_2)\big |\\&\quad \lesssim \prod _{i=1}^2 2^{-|k_i-\widetilde{k}_i|\epsilon }\bigg (\frac{2^{-k_i}+2^{-\widetilde{k}_i}}{|x_i-x_{I_{\alpha _i}^{k_i}}|+2^{-k_i}+2^{-\widetilde{k}_i}}\bigg )^\epsilon \nonumber \\&\qquad \times \frac{1}{m_\lambda (I(x_i, 2^{-k_i}+2^{-\widetilde{k}_i} ))+m_\lambda (I(x_{I_{\alpha _i}^{k_i}}, 2^{-k_i}+2^{-\widetilde{k}_i}))+m_\lambda (I(x_i, |x_i-x_{I_{\alpha _i}^{k_i}}|))},\nonumber \end{aligned}$$
(3.6)

and the implicit constant on the right-hand side of the above inequality is independent of \({{\tilde{k}}}_1\) and \({{\tilde{k}}}_2\). By substituting (3.6) back to the right-hand side of (3.5), we have that

$$\begin{aligned}&{\mathbb {G}}(f)(x_1,x_2)^2 \nonumber \\&\quad \lesssim \sum _{\widetilde{k}_1}\sum _{\widetilde{k}_2} \bigg | \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} \prod _{i=1}^2\bigg (\frac{2^{-k_i}+2^{-\widetilde{k}_i}}{|x_i-x_{I_{\alpha _i}^{k_i}}|+2^{-k_i}+2^{-\widetilde{k}_i}}\bigg )^\epsilon \nonumber \\&\qquad \times \frac{m_\lambda (I_{\alpha _i}^{k_i}) 2^{-|k_i-\widetilde{k}_i|\epsilon }}{m_\lambda (I(x_i, 2^{-k_i}+2^{-\widetilde{k}_i} ))+m_\lambda (I(x_{I_{\alpha _i}^{k_i}}, 2^{-k_i}+2^{-\widetilde{k}_i}))+m_\lambda (I(x_i, |x_i-x_{I_{\alpha _i}^{k_i}}|))}\nonumber \\&\qquad \times \left|D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}})\right| \bigg |^2\nonumber \\&\quad \lesssim \sum _{\widetilde{k}_1}\sum _{\widetilde{k}_2} \sum _{k_1} \sum _{k_2} 2^{-|k_1-\widetilde{k}_1|\epsilon }2^{-|k_2-\widetilde{k}_2|\epsilon } 2^{[(k_1\wedge \widetilde{k}_1)-k_1](2\lambda +1)(1-{1\over r})}2^{[(k_2\wedge \widetilde{k}_2)-k_2](2\lambda +1)(1-{1\over r})}\nonumber \\&\qquad \times \Bigg [ {\mathcal {M}}_1\bigg ( \sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} {\mathcal {M}}_2\Big ( \sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} \inf _{\genfrac{}{}{0.0pt}{}{y_1\in I_{\alpha _1}^{k_1}}{y_2\in I_{\alpha _2}^{k_2}}}\left|D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(y_1,y_2)\right|^r \nonumber \\&\qquad \times \chi _{I_{\alpha _2}^{k_2}}(\cdot )\Big )(x_2) \chi _{I_{\alpha _1}^{k_1}}(\cdot ) \bigg )(x_1) \Bigg ]^{2\over r}, \end{aligned}$$
(3.7)

where \(r<1\), \(a\wedge b:=\min \{a,b\}\),

$$\begin{aligned} {\mathcal {M}}_1f(x_1,x_2):=\displaystyle \sup _{I \ni x_1}\displaystyle \frac{1}{m_\lambda (I)}\displaystyle \int _I |f(y_1, x_2)|\,{dm_\lambda }(y_1), \end{aligned}$$
(3.8)

and

$$\begin{aligned} {\mathcal {M}}_2f(x_1,x_2):=\displaystyle \sup _{J \ni x_2}\displaystyle \frac{1}{m_\lambda (J)}\displaystyle \int _J |f(x_1, y_2)|\,{dm_\lambda }(y_2). \end{aligned}$$
(3.9)

By taking the square root and then the \(L^p({\mathbb {R}}_\lambda )\) norm on both sides of the above inequality and then using Fefferman–Stein’s vector-valued maximal function inequality, we obtain that (3.3) holds. Hence, we have that (3.2) holds.
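
The role of the exponent \(r<1\) in the last step can be made explicit. The following one-parameter sketch uses placeholder functions \(g_k\ge 0\) and a generic maximal operator \({\mathcal {M}}\) (both introduced here for illustration), and assumes the sums over \(\widetilde{k}_i\) have already been carried out. Since \(p/r>1\) and \(2/r>1\), the Fefferman–Stein inequality applies in \(L^{p/r}\) with inner exponent \(2/r\):

$$\begin{aligned} \bigg \Vert \Big \{ \sum _{k}\big [{\mathcal {M}}(g_k)\big ]^{2/r}\Big \} ^{1/2}\bigg \Vert _{L^p}&=\bigg \Vert \Big \{ \sum _{k}\big [{\mathcal {M}}(g_k)\big ]^{2/r}\Big \} ^{r/2}\bigg \Vert _{L^{p/r}}^{1/r}\\&\lesssim \bigg \Vert \Big \{ \sum _{k}|g_k|^{2/r}\Big \} ^{r/2}\bigg \Vert _{L^{p/r}}^{1/r} =\bigg \Vert \Big \{ \sum _{k}|g_k|^{2/r}\Big \} ^{1/2}\bigg \Vert _{L^p}. \end{aligned}$$

If \(g_k=\sum _{\alpha }c_{k,\alpha }^{\,r}\chi _{I_\alpha ^k}\) with pairwise disjoint intervals for each fixed k, then \(|g_k|^{2/r}=\sum _{\alpha }c_{k,\alpha }^{\,2}\chi _{I_\alpha ^k}\), so the right-hand side is a discrete square function. Taking \(c_{k,\alpha }\) to be the infimum appearing in (3.7), which is at most the value at the center of the interval, gives the bound by \(S_d(f)\). In the product setting the same inequality is applied successively to \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\).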

Following the same steps as above, we now sketch the proof of (3.1). Applying the following version of Calderón’s reproducing formula, valid for almost all \(x_2\),

$$\begin{aligned} f(x_1,x_2)&= \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} m_\lambda (I_{\alpha _1}^{k_1}) {\tilde{D}}^{(1)}_{k_1}(x_1,x_{I_{\alpha _1}^{k_1}}) D^{(1)}_{k_1}(f)(x_{I_{\alpha _1}^{k_1}},x_{2}) \end{aligned}$$

to the left-hand side of (3.1), we obtain that

$$\begin{aligned} \widetilde{{\mathbb {G}}}(f)(x_1, x_2)^2&:=\int _0^\infty \left|Q^{(1)}_{t_1}(f)(x_1,x_2)\right|^2 {dt_1\over t_1}\\&=\sum _{\widetilde{k}_1\in {\mathbb {Z}}} \int _{2^{-\widetilde{k}_1-1}}^{2^{-\widetilde{k}_1}}\bigg | \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} m_\lambda (I_{\alpha _1}^{k_1})\\&\quad \times Q^{(1)}_{t_1}({\tilde{D}}^{(1)}_{k_1}(\cdot ,x_{I_{\alpha _1 }^{k_1}}))(x_1)D^{(1)}_{k_1}(f)(x_{I_{\alpha _1}^{k_1}},x_{2}) \bigg |^2 {dt_1\over t_1}. \end{aligned}$$

Then, using the almost orthogonality estimate for \(Q^{(1)}_{t_1}({\tilde{D}}^{(1)}_{k_1}(\cdot ,x_{I_{\alpha _1}^{k_1}}))(x_1)\), we have the following estimate:

$$\begin{aligned} \widetilde{{\mathbb {G}}}(f)(x_1, x_2)^2&\lesssim \sum _{\widetilde{k}_1\in {\mathbb {Z}}}\sum _{k_1} 2^{-|k_1-\widetilde{k}_1|\epsilon }2^{[(k_1\wedge \widetilde{k}_1)-k_1](2\lambda +1)(1-{1\over r})}\\&\quad \times \Bigg [ {\mathcal {M}}_1\bigg ( \sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \inf _{y_1\in I_{\alpha _1}^{k_1}}\left|D^{(1)}_{k_1}(f)(y_1,x_2)\right|^r \chi _{I_{\alpha _1}^{k_1}}(\cdot ) \bigg )(x_1) \Bigg ]^{2\over r}, \end{aligned}$$

where \(r<1\). By taking the square root and then the \(L^p({\mathbb {R}}_+,\,{dm_\lambda }(x_1))\) norm on both sides of the above inequality and then using the Fefferman–Stein vector-valued maximal function inequality, we obtain that (3.1) holds. Similarly, we can obtain that (3.1) holds for \(Q^{(2)}_{t_2}\). \(\square \)

We remark that in the proof of Theorem 3.1, (3.3) shows that the Littlewood–Paley g-function \( {\mathbb {G}}(f)\) is dominated by \(S_d(f)\) in \(L^p({\mathbb {R}}_\lambda )\) for \(p\in (1, \infty )\). Moreover, observe that the estimate (3.7) also holds for \({2\lambda +1\over 2\lambda +2}<p\le 1\) and \({2\lambda +1\over 2\lambda +2}<r<p\). Then by the \(L^p({\mathbb {R}}_\lambda )\)-boundedness of the maximal operators \({\mathcal {M}}_1\) and \( {\mathcal {M}}_2\) for \(p\in (1, \infty )\), we further see that (3.3) also holds for \({2\lambda +1\over 2\lambda +2}<p\le 1\), which is stated as follows.

Proposition 3.2

Let \(p\in ({2\lambda +1\over 2\lambda +2}, 1]\), \(Q^{(1)}_{t_1}\) and \(Q^{(2)}_{t_2}\) be the same as above. Then for every \(f\in H^p({\mathbb {R}}_\lambda )\cap L^2({\mathbb {R}}_\lambda )\), we have that

$$\begin{aligned} \Vert {\mathbb {G}}(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert S_d(f)\Vert _{L^p({\mathbb {R}}_\lambda )}. \end{aligned}$$

Next we provide the \(L^p({\mathbb {R}}_\lambda )\)-boundedness (for \(1<p<\infty \)) of the product Littlewood–Paley area functions associated with the operators \(Q^{(1)}_{t_1}\) and \(Q^{(2)}_{t_2}\).

Theorem 3.3

Let \(Q^{(1)}_{t_1}\) and \(Q^{(2)}_{t_2}\) be the same as above. Then for \(1<p<\infty \) and for every \(f\in L^p({\mathbb {R}}_\lambda )\), we have that

$$\begin{aligned}&\bigg \Vert \left\{ \iint _{\Gamma (x) } \Big |Q^{(1)}_{t_1}Q^{(2)}_{t_2}(f)(y_1,y_2)\Big |^2 {d\mu _\lambda (y_1,y_2) dt_1 dt_2 \over t_1 m_\lambda (I(x_1, t_1)) t_2 m_\lambda (I(x_2, t_2))} \right\} ^{1\over 2}\bigg \Vert _{L^p({\mathbb {R}}_\lambda )}\nonumber \\&\quad \lesssim \Vert f\Vert _{L^p({\mathbb {R}}_\lambda )}, \end{aligned}$$
(3.10)

where \(\Gamma (x):=\Gamma (x_1)\times \Gamma (x_2)\).

Proof

The proof of this theorem is similar to that of Theorem 3.1 and so we briefly sketch the proof. From Calderón’s reproducing formula (3.4) and the almost orthogonality estimate (3.6), we have

$$\begin{aligned}&\iint _{\Gamma (x) } \Big |Q^{(1)}_{t_1}Q^{(2)}_{t_2}(f)(y_1,y_2)\Big |^2 {d\mu _\lambda (y_1,y_2) dt_1 dt_2 \over t_1 m_\lambda (I(x_1, t_1)) t_2 m_\lambda (I(x_2, t_2))} \\&\quad \lesssim \sum _{k_1,\,k_2} \Bigg [ {\mathcal {M}}_1\bigg ( \sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} {\mathcal {M}}_2\Big ( \sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} \inf _{\genfrac{}{}{0.0pt}{}{y_1\in I_{\alpha _1}^{k_1}}{y_2\in I_{\alpha _2}^{k_2}}}|D^{(1)}_{k_1}D^{(2)}_{k_2}(f)(y_1,y_2)|^r \chi _{I_{\alpha _2}^{k_2}}(\cdot )\Big )(x_2) \chi _{I_{\alpha _1}^{k_1}}(\cdot ) \bigg )(x_1) \Bigg ]^{2\over r}, \end{aligned}$$

which implies that the left-hand side of (3.10) is bounded by \(\big \Vert S_d(f) \big \Vert _{L^p({\mathbb {R}}_\lambda )}\). This, together with (2.1), finishes the proof of Theorem 3.3. \(\square \)

Note that from Proposition 2.9 and Lemma 2.8\({\mathrm{(S_{iii}})}\), we see that the kernels of S(f) and \(S_u(f)\) satisfy the conditions \({\mathrm{(K_i)}}, {\mathrm{(K_{ii}})}\) and \({\mathrm{(K_{iii}})}\) listed in Sect. 3.1. As a direct consequence of Theorem 3.3, we have

Theorem 3.4

The product Littlewood–Paley area functions S(f) and \(S_u(f)\) are bounded operators on \(L^p({\mathbb {R}}_\lambda )\), \(1<p<\infty \).

3.2 Characterisation of Hardy Space \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) via \(S_d(f)\)

In this subsection, based on Theorem 3.4 and the atomic decomposition of \(H^{p}({\mathbb {R}}_\lambda )\), we now show that the Hardy space \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) coincides with the known Hardy space \(H^{p}({\mathbb {R}}_\lambda )\).

Theorem 3.5

Let \(p\in ((2\lambda +1)/(2\lambda +2), 1]\). The space \(H^{p}_{\Delta _\lambda }({\mathbb {R}}_\lambda ) \) coincides with the product Hardy space \(H^{p}({\mathbb {R}}_\lambda )\) and they have equivalent norms (or quasi-norms).

To prove this result, we note that the known product Hardy space \(H^p({\mathbb {R}}_\lambda )\) (Definition 2.5) is a subset of a larger distribution space, while our \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) is a completion of the initial subspace in \(L^2({\mathbb {R}}_\lambda )\). Hence, to show that \(H^p({\mathbb {R}}_\lambda )\) and \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) coincide, the main approach used here is via atomic decompositions, i. e., showing that \(H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) has a particular atomic decomposition as in Proposition 2.7. Following an idea in [16], we show that by choosing a particular function for Calderón’s reproducing formula (Proposition 3.6), we obtain the atomic decomposition for \(H^{p}_{\Delta _\lambda }( {\mathbb {R}}_\lambda )\), which leads to \(H^p({\mathbb {R}}_\lambda )\cap L^2({\mathbb {R}}_\lambda )=H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\cap L^2({\mathbb {R}}_\lambda )\) with equivalent norms (quasi-norms).

Now we recall the following construction of \(\psi \) on \({\mathbb {R}}\) in [26]. Let \(\varphi :=-\pi i\chi _{\{\frac{1}{2}<|x|<1\}}\) and let \(\psi \) be the Fourier transform of \(\varphi \). That is,

$$\begin{aligned} \psi (s):=s^{-1}(2\sin (s/2)-\sin s). \end{aligned}$$

Consider the operator

$$\begin{aligned} \psi \big (t\sqrt{\Delta _\lambda }\big ):= \big (t\sqrt{\Delta _\lambda }\big )^{-1}\left[2\sin \big (t\sqrt{\Delta _\lambda }/2 \big )-\sin \big (t\sqrt{\Delta _\lambda }\big )\right]. \end{aligned}$$
(3.11)

Proposition 3.6

([26]) For all \(t\in (0, \infty )\), \(\psi \left(t\sqrt{\Delta _\lambda }\right)(1)=0\) and the kernel \(K_{\psi (t\sqrt{\Delta _\lambda })}\) of \(\psi (t\sqrt{\Delta _\lambda })\) has support contained in \(\{(x, y)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\, |x-y|\le t\}.\)

Proof of Theorem 3.5

Note that, by the definition of \(H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\), the space \(L^2({\mathbb {R}}_\lambda )\cap H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\) is dense in \(H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\). Moreover, from [43, Proposition 2.19], we have that \(L^2({\mathbb {R}}_\lambda )\cap H^p({\mathbb {R}}_\lambda )\) is dense in \(H^p({\mathbb {R}}_\lambda )\). Thus, by a density argument, it suffices to show that

$$\begin{aligned} L^2({\mathbb {R}}_\lambda )\cap H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )=L^2({\mathbb {R}}_\lambda )\cap H^p({\mathbb {R}}_\lambda ) \end{aligned}$$
(3.12)

with equivalent norms.

We first prove that for every \(f\in L^2({\mathbb {R}}_\lambda )\cap H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\), f belongs to \(H^p({\mathbb {R}}_\lambda )\) and

$$\begin{aligned} \Vert f\Vert _{H^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )}. \end{aligned}$$
(3.13)

To prove this, it suffices to show that every \(f\in L^2({\mathbb {R}}_\lambda )\cap H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\) has an atomic decomposition, with atoms satisfying the properties in Definition 2.6.

To see this, we adapt the proof of the atomic decomposition in [16, Proposition 3.4] to our current setting of Bessel operators. Let \(f\in L^2({\mathbb {R}}_\lambda )\cap H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\). For each \(j\in {{\mathbb {Z}}}\), define

$$\begin{aligned} \Omega _j&:=\left\{ (x_1, x_2)\in {\mathbb {R}}_\lambda :\, Sf(x_1,x_2)>2^j\right\} ,\\ B_j&:=\left\{ R:=I^{k_1}_{\alpha _1}\times I^{k_2}_{\alpha _2}:\, \mu _\lambda (R\cap \Omega _j)>\mu _\lambda (R)/2,\,\mu _\lambda (R\cap \Omega _{j+1})\le \mu _\lambda (R)/2\right\} , \end{aligned}$$

and

$$\begin{aligned} {\widetilde{\Omega }}_j:=\left\{ (x_1, x_2)\in {\mathbb {R}}_\lambda :\, {{\mathcal {M}}_S}(\chi _{\Omega _j})(x_1,x_2)>1/2\right\} . \end{aligned}$$

Here we use \(I^{k_i}_{\alpha _i}\) (\(i=1,2\)) to denote the dyadic intervals as stated in Sect. 2.1 and \({{\mathcal {M}}_S}\) is the maximal function defined by

$$\begin{aligned} {{{\mathcal {M}}_S}}(f)(x_1, x_2):=\displaystyle \sup _{\genfrac{}{}{0.0pt}{}{R\ni (x_1, x_2)}{R:=I\times J}}\displaystyle \frac{1}{\mu _\lambda (R)}\iint _R|f(y_1,y_2)|\,d\mu _\lambda (y_1,y_2), \end{aligned}$$
(3.14)

where the supremum is taken over all rectangles \(R:=I\times J\) with intervals \(I,\,J\subset {\mathbb {R}}_+\).

For each dyadic rectangle \(R:= I^{k_1}_{\alpha _1}\times I^{k_2}_{\alpha _2}\) in \({\mathbb {R}}_\lambda \), the tent T(R) is defined as

$$\begin{aligned} T(R):= \left\{ (y_1, y_2, t_1, t_2): (y_1, y_2)\in R,\, t_1\in \left(2^{-k_1}, 2^{-k_1+1}\right], t_2\in \left(2^{-k_2}, 2^{-k_2+1}\right]\right\} . \end{aligned}$$

We now consider the following reproducing formula

$$\begin{aligned} f=C_\psi \displaystyle \int _0^\infty \displaystyle \int _0^\infty \psi (t_1\sqrt{\Delta _\lambda })\psi (t_2\sqrt{\Delta _\lambda }) \big ( t_1\sqrt{\Delta _\lambda }e^{-t_1\sqrt{\Delta _\lambda }} \otimes t_2\sqrt{\Delta _\lambda }e^{-t_2\sqrt{\Delta _\lambda }}\big ) f\,\frac{dt_1}{t_1}\frac{dt_2}{t_2} \end{aligned}$$

in the sense of \(L^2({\mathbb {R}}_\lambda )\), where \(\psi (t_1\sqrt{\Delta _\lambda })\) and \(\psi (t_2\sqrt{\Delta _\lambda })\) are defined as in (3.11) and \(C_\psi \) is a constant depending on \(\psi \). Then we have

$$\begin{aligned} f(x_1, x_2)= & {} C_\psi \sum _{j\in {{\mathbb {Z}}}}\sum _{R\in B_j}\iiiint _{T(R)}\psi (t_1\sqrt{\Delta _\lambda })(x_1,y_1)\psi (t_2\sqrt{\Delta _\lambda })(x_2,y_2) \nonumber \\&\quad \big (t_1\sqrt{\Delta _\lambda }e^{-t_1\sqrt{\Delta _\lambda }} \otimes t_2\sqrt{\Delta _\lambda }e^{-t_2\sqrt{\Delta _\lambda }}\big )f(y_1, y_2)\,d\mu _\lambda (y_1,y_2)\frac{dt_1}{t_1}\frac{dt_2}{t_2}\nonumber \\=: & {} \sum _{j\in {{\mathbb {Z}}}}\alpha _ja_j(x_1, x_2) \end{aligned}$$
(3.15)

with

$$\begin{aligned} \alpha _j&:=C_\psi \bigg \Vert \bigg (\sum _{R\in B_j}\displaystyle \int _0^\infty \displaystyle \int _0^\infty \left|\big (t_1\sqrt{\Delta _\lambda }e^{-t_1\sqrt{\Delta _\lambda }} \otimes t_2\sqrt{\Delta _\lambda } e^{-t_2\sqrt{\Delta _\lambda }}\big )f(\cdot , \cdot )\right|^2\chi _{T(R)}\frac{dt_1}{t_1}\frac{dt_2}{t_2}\bigg )^{\frac{1}{2}}\bigg \Vert _{L^2({\mathbb {R}}_\lambda )}\\&\quad \times \mu _\lambda ({{\widetilde{\Omega }}}_j)^{1/p-1/2}, \end{aligned}$$

and

$$\begin{aligned} a_j(x_1, x_2)=\sum _{{{\bar{R}}}\in m({{\widetilde{\Omega }}}_j)}a_{j,\,{{\bar{R}}}}(x_1, x_2), \end{aligned}$$

where

$$\begin{aligned} a_{j,\,{{\bar{R}}}}&:=\sum _{R\in B_j,\,R\subset {{\bar{R}}}}\frac{1}{\alpha _j}\iiiint _{T(R)}\psi (t_1\sqrt{\Delta _\lambda }) (x_1,y_1)\psi (t_2\sqrt{\Delta _\lambda })(x_2,y_2)\\&\quad \big (t_1\sqrt{\Delta _\lambda }e^{-t_1\sqrt{\Delta _\lambda }} \otimes t_2\sqrt{\Delta _\lambda }e^{-t_2\sqrt{\Delta _\lambda }}\big )(f)(y_1, y_2)\,d\mu _\lambda (y_1,y_2)\frac{dt_1}{t_1}\frac{dt_2}{t_2}. \end{aligned}$$

Following the proof of [16, Proposition 3.4], we deduce that

$$\begin{aligned} \sum _j|\alpha _j|^p \le C\Vert f\Vert _{H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )}^p, \end{aligned}$$
(3.16)

that for each j, \(a_j\) is supported in \({{\widetilde{\Omega }}}_j\), and that \(\Vert a_j\Vert _{L^2({\mathbb {R}}_\lambda )}\le \mu _\lambda ({{\widetilde{\Omega }}}_j)^{1/2-1/p},\) which implies that \(a_j\) satisfies (1) and (2) in Definition 2.6. Moreover, each \(a_{j,\,{{\bar{R}}}}(x_1, x_2)\) is supported in \(C{{\bar{R}}}\), where C is a fixed positive constant, and

$$\begin{aligned} \sum _{{{\bar{R}}}\in m({{\widetilde{\Omega }}}_j)} \Vert a_{j,\,{\bar{R}}}\Vert ^2_{L^2({\mathbb {R}}_\lambda )}\le \mu _\lambda ({{\widetilde{\Omega }}}_j)^{1-2/p}. \end{aligned}$$

By Proposition 3.6, we also see that

$$\begin{aligned} \displaystyle \int a_{j,\,{{\bar{R}}}}(x_1, x_2)\,dm_\lambda (x_1)=\displaystyle \int a_{j,\,{{\bar{R}}}}(x_1, x_2)\,dm_\lambda (x_2)=0. \end{aligned}$$

This shows that \(a_j\) satisfies (3) in Definition 2.6. Combining these results, we get that for each j, \(a_j\) is an atom as in Definition 2.6. Hence, applying the result of Proposition 2.7 and (3.15) and (3.16), we obtain that (3.13) holds.

Next we prove that for every \(f\in L^2({\mathbb {R}}_\lambda )\cap H^p({\mathbb {R}}_\lambda ) \), f belongs to \(H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )\) and

$$\begin{aligned} \Vert f\Vert _{H^p_{{\Delta _\lambda }}( {\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p({\mathbb {R}}_\lambda )} . \end{aligned}$$
(3.17)

To see this, note that for every \(f\in L^2({\mathbb {R}}_\lambda )\cap H^p({\mathbb {R}}_\lambda ) \), from Proposition 2.7, we obtain that \( f=\sum _k \lambda _k a_k,\) where each \(a_k\) is an atom as in Definition 2.6 and \(\sum _k|\lambda _k|^p\lesssim \Vert f\Vert _{H^p({\mathbb {R}}_\lambda )}^p\).

Since S is bounded on \(L^2({\mathbb {R}}_\lambda )\) (Theorem 3.4) and, for any t, the kernel of \(t\sqrt{\triangle _\lambda }e^{-t\sqrt{\triangle _\lambda }}\) satisfies the conditions \({\mathrm{(K_i)}}, {\mathrm{(K_{ii}})}\) and \({\mathrm{(K_{iii}})}\), a standard argument gives

$$\begin{aligned} \Vert S(a_k)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim 1, \end{aligned}$$

where the implicit constant is independent of \(a_k\). As a consequence, we have

$$\begin{aligned} \Vert S(f)\Vert _{L^p({\mathbb {R}}_\lambda )}^p\le \sum _k|\lambda _k|^p\Vert S(a_k)\Vert _{L^p({\mathbb {R}}_\lambda )}^p \lesssim \Vert f\Vert _{H^p({\mathbb {R}}_\lambda )}^{p}, \end{aligned}$$

which implies that (3.17) holds.
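
The first inequality in the last display rests on two elementary facts, recorded here as a sketch. We assume that the sublinearity of S gives \(S(f)\le \sum _k|\lambda _k|S(a_k)\) pointwise for the decomposition above, and we use the inequality \((\sum _k b_k)^p\le \sum _k b_k^p\), valid for \(b_k\ge 0\) and \(0<p\le 1\). Together they give

$$\begin{aligned} \int _{{\mathbb {R}}_\lambda }[S(f)(x)]^p\,d\mu _\lambda (x)&\le \int _{{\mathbb {R}}_\lambda }\Big (\sum _k|\lambda _k|\,S(a_k)(x)\Big )^p\,d\mu _\lambda (x)\\&\le \sum _k|\lambda _k|^p\int _{{\mathbb {R}}_\lambda }[S(a_k)(x)]^p\,d\mu _\lambda (x) =\sum _k|\lambda _k|^p\Vert S(a_k)\Vert _{L^p({\mathbb {R}}_\lambda )}^p. \end{aligned}$$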

Combining the results of (3.13) and (3.17), we get that (3.12) holds, with equivalent norms. This completes the proof of Theorem 3.5. \(\square \)

We also have the following remarks.

Remark 3.7

We note that we can also define a version of product Hardy space, denoted by \(H^p_{\Delta _\lambda ,\,1}( {\mathbb {R}}_\lambda )\), as in Definition 1.1 with \(t\sqrt{\triangle _\lambda }e^{-t\sqrt{\triangle _\lambda }}\) replaced by \(Q_t:=t^2{\triangle _\lambda }e^{-t^2{\triangle _\lambda }}\). This product Hardy space also coincides with \(H^{p}({\mathbb {R}}_\lambda )\) and they have equivalent norms (or quasi-norms). See Proposition 3.9 below.

Remark 3.8

Define the square function associated with the operator \(\Delta _\lambda \), by setting for any \(f\in L^p({\mathbb {R}}_\lambda )\) with \(1<p<\infty \) and \(x:=(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\),

$$\begin{aligned} g(f)(x) := \bigg (\int _{0}^\infty \int _{0}^\infty \Big | Q_{t_1}\, Q_{t_2} f(x_1,x_2)\Big |^2\ { dt_1 dt_2 \over t_1 t_2 }\bigg )^{1\over 2}, \end{aligned}$$

where \(Q_t:=t^2{\triangle _\lambda }e^{-t^2{\triangle _\lambda }}\) or \(Q_t:=t\sqrt{{\triangle _\lambda }}e^{-t\sqrt{{\triangle _\lambda }}}\). We can also define the product Hardy spaces \(H^p_{\Delta _\lambda }( {\mathbb {R}}_\lambda )\) as in Definition 1.1 using g(f), denoted by \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\) and \(H^p_{\Delta _\lambda ,\,3}( {\mathbb {R}}_\lambda )\), respectively. These two versions of product Hardy spaces coincide with \(H^{p}({\mathbb {R}}_\lambda )\) and they have equivalent norms (or quasi-norms).

Based on the proof of Theorem 3.5 above, we now prove that the three versions of Hardy spaces, i. e., \(H^p_{\Delta _\lambda ,\,1}( {\mathbb {R}}_\lambda )\), \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\) and \(H^p_{\Delta _\lambda ,\,3}( {\mathbb {R}}_\lambda )\), as defined in Remarks 3.7 and 3.8, coincide with \(H^{p}({\mathbb {R}}_\lambda )\).

Proposition 3.9

For \(p\in ({2\lambda +1\over 2\lambda +2},1]\), the Hardy spaces \(H^p_{\Delta _\lambda ,\,1}( {\mathbb {R}}_\lambda )\), \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\) and \(H^p_{\Delta _\lambda ,\,3}( {\mathbb {R}}_\lambda )\) coincide with \(H^{p}({\mathbb {R}}_\lambda )\) and they have equivalent norms (or quasi-norms).

Proof

We first consider \(H^p_{\Delta _\lambda ,\,1}( {\mathbb {R}}_\lambda )\) defined by the Littlewood–Paley area function via the heat semigroup \(\{e^{-t{\triangle _\lambda }}\}_{t>0}\). Since the kernel of \(\{e^{-t{\triangle _\lambda }}\}_{t>0}\) satisfies conditions \({\mathrm{(K_i)}}, {\mathrm{(K_{ii}})}\) and \({\mathrm{(K_{iii}})}\), following the proof of Theorem 3.5, we obtain that \(H^p_{\Delta _\lambda ,\,1}( {\mathbb {R}}_\lambda )\) coincides with \(H^{p}({\mathbb {R}}_\lambda )\) and they have equivalent norms (or quasi-norms).

Next we consider \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\) defined by the Littlewood–Paley g-function via the Poisson semigroup \(\{e^{-t\sqrt{{\triangle _\lambda }}}\}_{t>0}\). Suppose \(f\in H^p({\mathbb {R}}_\lambda )\). Then from Proposition 3.2, we obtain that

$$\begin{aligned} \Vert g(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert S_d(f)\Vert _{L^p({\mathbb {R}}_\lambda )}, \end{aligned}$$

which shows that \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\supseteq H^p({\mathbb {R}}_\lambda )\). Conversely, suppose \(f\in H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\). Following the proof of Proposition 3.2, we obtain that \(\Vert S_d(f)\Vert _{L^p({\mathbb {R}}_\lambda )} \lesssim \Vert g(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\) and hence \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\subseteq H^p({\mathbb {R}}_\lambda )\). As a consequence, we get that \(H^p_{\Delta _\lambda ,\,2}( {\mathbb {R}}_\lambda )\) coincides with \(H^p({\mathbb {R}}_\lambda )\) and they have equivalent norms. A similar argument holds for \(H^p_{\Delta _\lambda ,\,3}( {\mathbb {R}}_\lambda )\). \(\square \)

4 Proof of Theorem 1.4

This section is devoted to the proof of Theorem 1.4. To this end, we will prove the chain of six inequalities as in (1.7) by the following six steps, respectively.

Step 1: \(\Vert f\Vert _{H^p_{\Delta _\lambda }( {\mathbb {R}}_\lambda )} \le \Vert f\Vert _{H^p_{S_u}( {\mathbb {R}}_\lambda )}\) for \(f\in H^p_{S_u}( {\mathbb {R}}_\lambda )\cap L^2({\mathbb {R}}_\lambda )\).

Note that from the definitions of the area functions Sf and \(S_uf\) in (1.4) and (1.5) respectively, we have for \(f\in L^2({\mathbb {R}}_\lambda )\), \(S(f)(x) \le S_u(f)(x)\), which implies that \(\Vert f\Vert _{H^p_{\Delta _\lambda }( {\mathbb {R}}_\lambda )} \le \Vert f\Vert _{H^p_{S_u}( {\mathbb {R}}_\lambda )}\).

Step 2: \(\Vert f\Vert _{H^p_{S_u}( {\mathbb {R}}_\lambda )} \lesssim \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{P}}({\mathbb {R}}_\lambda )}\) for \(f\in H^p_{{{\mathcal {N}}}_{P}}( {\mathbb {R}}_\lambda )\cap L^2({\mathbb {R}}_\lambda )\).

We first recall again

$$\begin{aligned} {\nabla _{t,\,x}}u(t,x):=({\partial }_t u, {\partial }_x u),\, \,-{\triangle _{t,\,x}}u(t, x):=-{\Delta _\lambda }-\partial _t^2={\mathcal {D}}^*{\mathcal {D}}-\partial _t^2, \end{aligned}$$

where \({\mathcal {D}}^*:=-\partial _x - {2\lambda \over x}\) is the formal adjoint operator of \({\mathcal {D}}:=\partial _x\) in \({L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\).
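
As a quick check of this adjoint relation, take \(dm_\lambda (x)=x^{2\lambda }\,dx\) and suppose that f and g are smooth with the boundary terms at 0 and \(\infty \) vanishing (both assumptions are made only for this sketch). Integration by parts then gives

$$\begin{aligned} \int _0^\infty ({\mathcal {D}}f)(x)\,g(x)\,x^{2\lambda }\,dx&=-\int _0^\infty f(x)\,\partial _x\big [g(x)\,x^{2\lambda }\big ]\,dx\\&=\int _0^\infty f(x)\Big [-\partial _x g(x)-\frac{2\lambda }{x}g(x)\Big ]x^{2\lambda }\,dx =\int _0^\infty f(x)\,({\mathcal {D}}^*g)(x)\,dm_\lambda (x), \end{aligned}$$

so that \({\mathcal {D}}^*{\mathcal {D}}=-\partial _x^2-\frac{2\lambda }{x}\partial _x\).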

Next we establish a lemma about finding the “conjugate pair” of functions \((\phi ,\psi )\) which plays a key role in this step.

Lemma 4.1

Let \(\phi \in C^\infty _c({\mathbb {R}}_+)\) such that \(\phi \ge 0\), \(\,{\mathrm {supp}\,}(\phi )\subset (0, 1)\) and

$$\begin{aligned} {\int _0^\infty }\phi (x)\,{dm_\lambda }(x)=1. \end{aligned}$$

Then there exists a function \(\psi (t, x, y)\) on \({\mathbb {R}}_+\times {\mathbb {R}}_+\times {\mathbb {R}}_+\), such that

  1. (i)

    for any function \(f\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) and \(t,\,x\in {\mathbb {R}}_+\),

    $$\begin{aligned} {\partial }_t (\phi _t{\sharp _\lambda }f)(x)= -{\mathcal {D}}^*[\psi (f)(t,x)]= {\partial }_x[\psi (f)(t,x)]+\frac{2\lambda }{x}\psi (f)(t,x), \end{aligned}$$
    (4.1)

    where

    $$\begin{aligned} \psi (f)(t,x):={\int _0^\infty }\psi (t, x, y)f(y){\,dm_\lambda (y)}; \end{aligned}$$
  2. (ii)

    for any \(t,\, x,\, y\) with \(|x-y|\ge t,\)

    $$\begin{aligned} \psi (t, x, y)\equiv 0; \end{aligned}$$
    (4.2)
  3. (iii)

    \(\psi (t, x, y)\) satisfies the conditions \({\mathrm{(K_i)}}\), \({\mathrm{(K_{ii}})}\) and \({\mathrm{(K_{iii}})}\).

Proof

First, by (2.2), we observe that

$$\begin{aligned} {\partial }_t{\tau ^{[\lambda ]}_x}\phi _t(y)= & {} -c_\lambda t^{-2\lambda -2}\displaystyle \int _0^\pi (\sin \theta )^{2\lambda -1}\left[(2\lambda +1)\phi \left(\frac{\sqrt{x^2+y^2-2xy\cos \theta }}{t}\right)\right.\\&\quad \left.+\frac{\sqrt{x^2+y^2-2xy\cos \theta }}{t}\phi ^\prime \left(\frac{\sqrt{x^2+y^2-2xy\cos \theta }}{t}\right)\right]\,d\theta . \end{aligned}$$

Note that (4.1) holds if \(\psi \) satisfies that for any \(t,\,x,\,y\),

$$\begin{aligned} {\partial }_t{\tau ^{[\lambda ]}_x}\phi _t(y)={\partial }_x\psi (t, x, y)+\frac{2\lambda }{x}\psi (t, x, y). \end{aligned}$$

Thus, we define

$$\begin{aligned}&\psi (t, x, y)\nonumber \\&\quad :=-c_\lambda t^{-2\lambda -2}x^{-2\lambda }\int _0^x \displaystyle \int _0^\pi (\sin \theta )^{2\lambda -1}\left[(2\lambda +1)\phi \left(\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\right)\right.\nonumber \\&\qquad \left.+\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\phi ^\prime \left(\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\right)\right]\,d\theta w^{2\lambda }dw. \end{aligned}$$
(4.3)

Then it is easy to see that \(\psi \) satisfies the equation (4.1).
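
For convenience, here is a sketch of this verification. Write \(G(t,w,y):={\partial }_t{\tau ^{[\lambda ]}_w}\phi _t(y)\), where the letter G is introduced only for this remark. By the first display of the proof, (4.3) reads \(x^{2\lambda }\psi (t,x,y)=\int _0^x G(t,w,y)\,w^{2\lambda }\,dw\), and hence

$$\begin{aligned} {\partial }_x\psi (t, x, y)+\frac{2\lambda }{x}\psi (t, x, y)=x^{-2\lambda }\,{\partial }_x\big [x^{2\lambda }\psi (t,x,y)\big ]=x^{-2\lambda }\,G(t,x,y)\,x^{2\lambda }={\partial }_t{\tau ^{[\lambda ]}_x}\phi _t(y). \end{aligned}$$

Integrating against \(f(y)\,{dm_\lambda }(y)\) then yields (4.1), assuming the usual representation \(\phi _t{\sharp _\lambda }f(x)={\int _0^\infty }{\tau ^{[\lambda ]}_x}\phi _t(y)f(y)\,{dm_\lambda }(y)\) and that differentiation may be interchanged with the integral.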

Now we prove that (4.2) holds. For all \(x,\,y,\,z\in (0, \infty )\), let \(\triangle (x, y, z)\) be the area of a triangle with sides x, y, z if such a triangle exists. We then define

$$\begin{aligned} D(x, y, z):=c_\lambda 2^{2\lambda -2}(xyz)^{-2\lambda +1}[\triangle (x, y, z)]^{2\lambda -2} \end{aligned}$$

if \(\triangle (x, y, z)\not =0\), and \(D(x, y, z):=0\) otherwise. By a change of variables argument, we obtain that

$$\begin{aligned} c_\lambda \displaystyle \int _0^\pi \phi _t\left(\sqrt{x^2+y^2-2xy\cos \theta }\right) (\sin \theta )^{2\lambda -1}\,d\theta =\displaystyle \int _0^\infty \phi _t(z)D(x, y, z)\,{dm_\lambda }(z). \end{aligned}$$

Recall that for all \(x,\,z\in (0, \infty )\),

$$\begin{aligned} \displaystyle \int _0^\infty D(x, y, z)\,{dm_\lambda }(y)=1; \end{aligned}$$
(4.4)

see [40, p. 335, (6)]. By a change of variables, we write

$$\begin{aligned} \psi (t, x,y)= & {} -t^{-2\lambda -2}x^{-2\lambda }\int _0^x{\int _0^\infty }D(w, y,z)\left[(2\lambda +1)\phi \left(\frac{z}{t}\right)+\frac{z}{t}\phi ^\prime \left(\frac{z}{t}\right)\right] z^{2\lambda }\,dzw^{2\lambda }dw\nonumber \\= & {} -t^{-1}x^{-2\lambda }\int _0^x{\int _0^\infty }D(w, y,z){\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]\,dzw^{2\lambda }\,dw\nonumber \\= & {} -t^{-1}x^{-2\lambda }{\int _0^\infty }\int _0^x D(w, y,z){\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]w^{2\lambda }\,dw\,dz. \end{aligned}$$
(4.5)
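
The passage from the first to the second line of (4.5) uses only the product rule. As a short check,

$$\begin{aligned} t^{-1}{\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]&=t^{-1}\left[(2\lambda +1)\frac{z^{2\lambda }}{t^{2\lambda +1}}\phi \left(\frac{z}{t}\right)+\frac{z^{2\lambda +1}}{t^{2\lambda +2}}\phi ^\prime \left(\frac{z}{t}\right)\right]\\&=t^{-2\lambda -2}z^{2\lambda }\left[(2\lambda +1)\phi \left(\frac{z}{t}\right)+\frac{z}{t}\phi ^\prime \left(\frac{z}{t}\right)\right]. \end{aligned}$$

The third line of (4.5) is then an interchange of the order of integration.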

We first prove \(\psi (t, x, y)=0\) if \(x>y+t\). To this end, recall that \(\,{\mathrm {supp}\,}(\phi )\subset (0, 1)\). Then \(\phi (z/t)\not =0\) only if \(z\in (0, t)\). Also, by the definition of \(D(w, y, z)\), we see that \(D(w, y,z)\not =0\) only if \(|y-z|< w<y+z\). Then by (4.4) and the fact that \(\phi \in C^\infty _c({\mathbb {R}}_+)\), we have that

$$\begin{aligned} \psi (t, x, y)= & {} -t^{-1}x^{-2\lambda }{\int _0^\infty }\int _0^\infty D(w, y,z){\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]w^{2\lambda }\,dw\,dz\\= & {} -t^{-1}x^{-2\lambda }{\int _0^\infty }\int _0^\infty D(w, y,z)w^{2\lambda }\,dw{\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]\,dz\\= & {} -t^{-1}x^{-2\lambda }{\int _0^\infty }{\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]\,dz=0. \end{aligned}$$

Similarly, assume that \(x<y-t\). Then by the compact support of \(\phi \), we see that

$$\begin{aligned} w\le x<y-t\le y-z\le |y-z|, \end{aligned}$$

which shows that \(D(w, y, z)=0\). This together with the definition of \(\psi (t, x, y)\) implies that \(\psi (t, x, y)=0\).

Now we show \(\psi \) satisfies \({\mathrm{(K_i)}}\). By (4.2) and the doubling property of \(dm_\lambda \) (1.3), it suffices to show that for any \(t, \,x,\,y\) such that \(|x-y|<t\),

$$\begin{aligned} |\psi (t, x,y)|\lesssim \frac{1}{m_\lambda (I(x, t))}\sim \frac{1}{x^{2\lambda }t+t^{2\lambda +1}}. \end{aligned}$$
(4.6)
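
The equivalence in (4.6) can be checked directly. The sketch below assumes \(\lambda >0\), \(dm_\lambda (y)=y^{2\lambda }\,dy\) and \(I(x,t)=(x-t,x+t)\cap {\mathbb {R}}_+\) (the last is our reading of the notation). If \(x\le t\), then

$$\begin{aligned} \frac{t^{2\lambda +1}}{2\lambda +1}\le \int _0^{x+t}y^{2\lambda }\,dy=m_\lambda (I(x,t))\le \frac{(2t)^{2\lambda +1}}{2\lambda +1}, \end{aligned}$$

while if \(x>t\), then

$$\begin{aligned} t\,x^{2\lambda }\le \int _x^{x+t}y^{2\lambda }\,dy\le m_\lambda (I(x,t))\le 2t\,(x+t)^{2\lambda }\le 2^{2\lambda +1}\,t\,x^{2\lambda }. \end{aligned}$$

In both cases \(m_\lambda (I(x,t))\sim x^{2\lambda }t+t^{2\lambda +1}\), with constants depending only on \(\lambda \).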

From (4.5), (4.4) and \(\phi \in C^\infty _c({\mathbb {R}}_+)\), we deduce that

$$\begin{aligned} |\psi (t, x,y)|\lesssim & {} t^{-1}x^{-2\lambda }\left|{\int _0^\infty }\int _0^x D(w, y,z){\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]w^{2\lambda }\,dw\,dz\right| \\\lesssim & {} t^{-1}x^{-2\lambda }. \end{aligned}$$

Moreover, if \(x\le t\), then from (4.5), (4.4) and \(\phi \in C^\infty _c({\mathbb {R}}_+)\), it follows that

$$\begin{aligned} |\psi (t, x,y)|\lesssim & {} t^{-2\lambda -2}x^{-2\lambda }\int _0^x{\int _0^\infty }D(w, y,z) \left|(2\lambda +1)\phi \left(\frac{z}{t}\right)+\frac{z}{t}\phi ^\prime \left(\frac{z}{t}\right)\right| z^{2\lambda }\,dzw^{2\lambda }dw\\\lesssim & {} t^{-2\lambda -2}x^{-2\lambda }\int _0^x{\int _0^\infty }D(w, y,z) z^{2\lambda } \,dzw^{2\lambda }dw \lesssim t^{-2\lambda -2}x\lesssim t^{-2\lambda -1}. \end{aligned}$$

Thus, \(\mathrm{(K_i)}\) holds.
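
To spell out how the two bounds combine: the first gives \(|\psi (t,x,y)|\lesssim t^{-1}x^{-2\lambda }\) for all x, and the second gives \(|\psi (t,x,y)|\lesssim t^{-2\lambda -1}\) when \(x\le t\). Since

$$\begin{aligned} x^{2\lambda }t+t^{2\lambda +1}&\le 2\,x^{2\lambda }t \quad \text {if } x>t,\\ x^{2\lambda }t+t^{2\lambda +1}&\le 2\,t^{2\lambda +1} \quad \text {if } x\le t, \end{aligned}$$

each bound is dominated, in its own range, by a constant multiple of \((x^{2\lambda }t+t^{2\lambda +1})^{-1}\), which is (4.6).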

Now we show \(\psi \) satisfies \({\mathrm{(K_{ii}})}\). By the assumption that \(|y-\widetilde{y}|\le (t+|x-y|)/2\) and the doubling property of \(dm_\lambda \), \(|x-y|+t\sim |x-\widetilde{y}|+t\) and

$$\begin{aligned}&m_\lambda (I(x, t))+m_\lambda (I(y,t))+m_\lambda (I(x, |x-y|))\\&\quad \sim m_\lambda (I(x, t))+m_\lambda (I(\widetilde{y},t))+m_\lambda (I(x, |x-\widetilde{y}|)). \end{aligned}$$

Based on these facts, we may further assume that \(y>\widetilde{y}\), since otherwise it is sufficient to show that

$$\begin{aligned}&|\psi (t, x, y)-\psi (t, x, \widetilde{y})|\nonumber \\&\quad \lesssim \frac{1}{m_\lambda (I(x, t))+m_\lambda (I(\widetilde{y},t))+m_\lambda (I(x, |x-\widetilde{y}|))}\frac{t|y-\widetilde{y}|}{(|x-\widetilde{y}|+t)^2}. \end{aligned}$$
(4.7)

Since \(y>\widetilde{y}\), if \(x>y+t\), then \(x>\widetilde{y}+t\), by which and (4.2), we see that \(\psi (t, x, y)=0\) and \(\psi (t, x, \widetilde{y})=0\). Similarly, if \(x<\widetilde{y}-t\), then \(x<y-t\) and by (4.2) again, \(\psi (t, x, y)=0\) and \(\psi (t, x, \widetilde{y})=0\). Hence, \({\mathrm{(K_{ii}})}\) holds trivially if \(x>y+t\) or \(x<\widetilde{y}-t\). Moreover, observe that

$$\begin{aligned}{}[\widetilde{y}-t, y+t]\supset [\widetilde{y}-t, \widetilde{y}+t]\cup [y-t, y+t], \end{aligned}$$

and when \({{\tilde{y}}}+t< y-t\), \(\psi (t, x, y)-\psi (t, x,\widetilde{y})=0\) for \(x\in ({{\tilde{y}}}+t, y-t)\). By similarity and (4.7), we only need to consider the case that \(y-t\le x\le y+t\). It then suffices to show that

$$\begin{aligned} |\psi (t, x, y)-\psi (t, x,\widetilde{y})|\lesssim \frac{1}{m_\lambda (I(y,t))+m_\lambda (I(y, |x-y|))}\frac{|y-\widetilde{y}|}{t}. \end{aligned}$$
(4.8)

We write

$$\begin{aligned}&|\psi (t, x, y)-\psi (t, x,\widetilde{y})|\\&\quad \lesssim t^{-2\lambda -2}x^{-2\lambda }\displaystyle \int _0^\pi \int _0^x (\sin \theta )^{2\lambda -1} \left|\phi \left(\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\right)\right.\\&\qquad \left.-\phi \left(\frac{\sqrt{w^2+\widetilde{y}^2-2w\widetilde{y}\cos \theta }}{t} \right)\right|w^{2\lambda }dwd\theta \\&\qquad + t^{-2\lambda -2}x^{-2\lambda }\displaystyle \int _0^\pi \int _0^x (\sin \theta )^{2\lambda -1} \left|\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\phi ^\prime \left(\frac{\sqrt{w^2+y^2-2wy\cos \theta }}{t}\right)\right.\\&\qquad \left.-\frac{\sqrt{w^2+{\widetilde{y}}^2-2w\widetilde{y}\cos \theta }}{t}\phi ^\prime \left(\frac{\sqrt{w^2+{\widetilde{y}}^2-2w\widetilde{y}\cos \theta }}{t}\right)\right|w^{2\lambda }dwd\theta \\&\quad =:{\mathrm{I}}+{\mathrm{II}}. \end{aligned}$$

We only consider the term \({\mathrm{I}}\), since for the term \({\mathrm{II}}\), we consider the function \(\widetilde{\phi }(x):=x \phi ^\prime (x)\), and then the form of \({\mathrm{II}}\) will be the same as \({\mathrm{I}}\).

For the term \({\mathrm{I}}\), we first note that from the mean value theorem,

$$\begin{aligned} {\mathrm{I}}&\lesssim t^{-2\lambda -2}x^{-2\lambda }\displaystyle \int _0^\pi \int _0^x (\sin \theta )^{2\lambda -1}\left|\phi ^\prime \left(\frac{\sqrt{w^2+\xi ^2-2w\xi \cos \theta }}{t}\right)\right|\\&\quad \times \frac{|\xi -w\cos \theta ||y-\widetilde{y}|}{t\sqrt{w^2+\xi ^2-2w \xi \cos \theta }}w^{2\lambda }dwd\theta \\&\lesssim t^{-2\lambda -2}x^{-2\lambda }\displaystyle \int _0^\pi \int _0^x (\sin \theta )^{2\lambda -1} \left|\phi ^\prime \left(\frac{\sqrt{w^2+\xi ^2-2w\xi \cos \theta }}{t}\right)\right| \frac{|y-\widetilde{y}|}{t}w^{2\lambda }dwd\theta , \end{aligned}$$

where \(\xi \) lies between \(\widetilde{y}\) and y and depends on \(\phi \), y and \({{\tilde{y}}}\). To continue, we consider the following two cases.

Case (i) \(0<y\le 8t\). In this case, by \(x<y+t\), we have \(x<9t\). Again, since \(|\phi '(x)|\lesssim 1\), we get that

$$\begin{aligned} {\mathrm{I}}\lesssim & {} x^{-2\lambda }|y-\widetilde{y}|\ t^{-2\lambda -3}\displaystyle \int _0^\pi \int _0^x (\sin \theta )^{2\lambda -1} w^{2\lambda }dwd\theta \\\lesssim & {} x^{-2\lambda }|y-\widetilde{y}|\ t^{-2\lambda -3} x^{2\lambda +1}\\\lesssim & {} |y-\widetilde{y}|t^{-2\lambda -2}\\\lesssim & {} \frac{1}{m_\lambda (I(y,t))+m_\lambda (I(y, |x-y|))}\frac{|y-{{\tilde{y}}}|}{t}. \end{aligned}$$

Case (ii) \(8t<y\). In this case, we have \(|x-y|<y/2\). For otherwise, we have \(|x-y|>4t\), which contradicts the assumption that \(y-t\le x\le y+t\). Observe that

$$\begin{aligned} |y-\xi |<|y-\widetilde{y}|\le \frac{t+|x-y|}{2}<\frac{5}{16}y, \end{aligned}$$

which implies that \(\xi \sim y\sim \widetilde{y} \sim x\). Thus, we see that

$$\begin{aligned} {\mathrm{I}}\lesssim & {} t^{-2\lambda -3}y^{-2\lambda }|y-\widetilde{y}|\displaystyle \int _0^\infty \int _0^\infty D(\xi , w, z) \left|\phi ^\prime \left(\frac{z}{t}\right)\right| w^{2\lambda }dw z^{2\lambda }dz\\\lesssim & {} t^{-2\lambda -3}y^{-2\lambda }|y-\widetilde{y}|\int _0^\infty \left|\phi ^\prime \left(\frac{z}{t}\right)\right|z^{2\lambda }dz\\\lesssim & {} t^{-2}y^{-2\lambda }|y-\widetilde{y}|\int _0^\infty \left|\phi ^\prime \left(z\right)\right|z^{2\lambda }dz\\\lesssim & {} t^{-2}y^{-2\lambda }|y-\widetilde{y}|\\\lesssim & {} \frac{1}{m_\lambda (I(y,t))+m_\lambda (I(y, |x-y|))}\frac{|y-{{\tilde{y}}}|}{t}. \end{aligned}$$
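
The comparability \(\xi \sim y\sim \widetilde{y}\sim x\) used in Case (ii) rests on elementary arithmetic, sketched here. Since \(t<y/8\) and \(|x-y|<y/2\) in this case,

$$\begin{aligned} \frac{t+|x-y|}{2}<\frac{1}{2}\left(\frac{y}{8}+\frac{y}{2}\right)=\frac{5}{16}y, \end{aligned}$$

and therefore \(\frac{11}{16}y<\xi ,\,\widetilde{y}<\frac{21}{16}y\) and \(\frac{1}{2}y<x<\frac{3}{2}y\).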

Combining the cases above we conclude that (4.8) holds, which implies \({\mathrm{(K_{ii}})}\).

Finally, we show that \({\mathrm{(K_{iii}})}\) holds. Indeed, by (4.5) together with (4.4) and \(\phi \in C^\infty _c({\mathbb {R}}_+)\), we conclude that

$$\begin{aligned}&{\int _0^\infty }\psi (t, x, y){dm_\lambda }(y)\\&\quad =-t^{-1}x^{-2\lambda }{\int _0^\infty }{\int _0^\infty }\int _0^x D(w, y,z){\partial }_z\left[\left(\frac{z}{t} \right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]w^{2\lambda }\,dw\,dz{dm_\lambda }(y)\\&\quad =-t^{-1}x^{-2\lambda }\int _0^x{\int _0^\infty }{\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]\,dz w^{2\lambda }\,dw\\&\quad =-\frac{x}{(2\lambda +1)t}{\int _0^\infty }{\partial }_z\left[\left(\frac{z}{t}\right)^{2\lambda +1}\phi \left(\frac{z}{t}\right)\right]\,dz =0. \end{aligned}$$

This shows \({\mathrm{(K_{iii}})}\), and hence finishes the proof of Lemma 4.1. \(\square \)

Next, we establish a version of Merryfield’s lemma ([67, Lemma 3.1]) in the Bessel operator setting.

Lemma 4.2

Let \(p\in ((2\lambda +1)/(2\lambda +2), 1]\) and \(\phi \) be as in Lemma 4.1. Then there exists a positive constant C such that for any \(f,\,g\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) with \(fg\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) and \(u(t, x):={P^{[\lambda ]}_t}f(x)\) satisfying \(\sup _{|y-x|<t}|u(t,y)|\in L^p({\mathbb {R}}_+,{dm_\lambda })\),

$$\begin{aligned} {\mathrm{I}}&:= {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|{\nabla _{t,\,x}}u(t,x)|^2\left|\phi _t\sharp _\lambda g(x)\right|^2 t{dm_\lambda }(x)\,dt\\&\le C \left[{\int _{{\mathbb {R}}_+}}[f(x)]^2[g(x)]^2{dm_\lambda }(x)+{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t,x)|^2|Q_t(g)(x)|^2{\frac{{dm_\lambda }(x)\,dt}{t}}\right], \end{aligned}$$

where \(Q_t( g)(x) := \big ( t \partial _t (\phi _t\sharp _\lambda g)(x),\ t \partial _x (\phi _t\sharp _\lambda g)(x),\psi ( g)(t,x)\big )\) is a vector-valued function, with \(\psi ( g)(t,x)\) as obtained in Lemma 4.1.

Proof

First, we claim that \(u(t, x)\rightarrow 0\) as \(t\rightarrow \infty \). Indeed, observe that for any \(x,y,\,t\in {\mathbb {R}}_+\) with \(|y-x|<t\),

$$\begin{aligned} |u(t, x)|=\left|{P^{[\lambda ]}_t}f(x)\right|\le \sup _{|y-z|<t}|u(t,z)|. \end{aligned}$$

Since \(\sup _{|x-y|<t}|u(t,y)|\in {L^p({{\mathbb {R}}}_+,\, dm_\lambda )}\), we have that

$$\begin{aligned} \left|{P^{[\lambda ]}_t}f(x)\right|^p\le & {} \frac{1}{m_\lambda (I(x, t))} \displaystyle \int _{I(x,\, t)}\bigg |\sup _{|y-z|<t}|u(t,z)|\bigg |^p{\,dm_\lambda (y)}\\\le & {} \frac{1}{m_\lambda (I(x, t))}\left\Vert\sup _{|\cdot -z|<t}|u(t,z)|\right\Vert_{L^p({{\mathbb {R}}}_+,\, dm_\lambda )}^p. \end{aligned}$$

This means that \(u(t, x)\rightarrow 0\) as \(t\rightarrow \infty \) and the claim follows.
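
The first inequality in the display above holds because, for every \(y\in I(x,t)\), the point x satisfies \(|x-y|<t\), so that \(|u(t,x)|\le \sup _{|y-z|<t}|u(t,z)|\), and one may average this bound over \(I(x,t)\). For a quantitative form of the decay, assume \(\lambda >0\) and \(dm_\lambda (y)=y^{2\lambda }\,dy\), so that \(m_\lambda (I(x,t))\ge t^{2\lambda +1}/(2\lambda +1)\). Under these assumptions one gets

$$\begin{aligned} |u(t,x)|\le (2\lambda +1)^{\frac{1}{p}}\,t^{-\frac{2\lambda +1}{p}}\left\Vert\sup _{|\cdot -z|<t}|u(t,z)|\right\Vert_{L^p({{\mathbb {R}}}_+,\, dm_\lambda )}, \end{aligned}$$

which tends to 0 as \(t\rightarrow \infty \) provided the norm on the right-hand side stays bounded in t.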

We now claim that

$$\begin{aligned} 2|{\nabla _{t,\,x}}u(t,x)|^2=-{\triangle _{t,\,x}}(u^2(t,x)). \end{aligned}$$
(4.9)

In fact, recall that u satisfies the equation (1.1). We then see that

$$\begin{aligned} -{\triangle _{t,\,x}}(u^2(t,x))={\partial }_t^2 u^2+{\partial }_x^2 u^2+\frac{2\lambda }{x}{\partial }_x u^2=2\left[({\partial }_t u)^2+({\partial }_x u)^2\right]=2|{\nabla _{t,\,x}}u(t,x)|^2. \end{aligned}$$

This implies claim (4.9).
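
In more detail, write \(L:={\partial }_t^2+{\partial }_x^2+\frac{2\lambda }{x}{\partial }_x\) (a symbol used only in this remark) and assume that (1.1) is the equation \(Lu=0\). The product rule then gives

$$\begin{aligned} L(u^2)&=2u\,{\partial }_t^2u+2({\partial }_t u)^2+2u\,{\partial }_x^2u+2({\partial }_x u)^2+\frac{4\lambda }{x}u\,{\partial }_x u\\&=2u\,Lu+2\left[({\partial }_t u)^2+({\partial }_x u)^2\right]=2|{\nabla _{t,\,x}}u|^2. \end{aligned}$$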

From the claim (4.9) and integration by parts, we deduce that

$$\begin{aligned} -2{\mathrm{I}}&={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\triangle _{t,\,x}}(u^2(t,x)) \left|\phi _t\sharp _\lambda g(x)\right|^2 t{dm_\lambda }(x)\,dt\\&=-{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\mathcal {D}}^*{\mathcal {D}}(u^2(t,x))(\phi _t\sharp _\lambda g(x))^2 t\,{dm_\lambda }(x)\,dt\\&\quad +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\partial }^2_t (u^2(t,x))\left[t(\phi _t\sharp _\lambda g(x))^2\right]\,{dm_\lambda }(x)\,dt\\&=-{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\partial }_x(u^2(t,x))\ {\partial }_x (\phi _t\sharp _\lambda g(x))^2 t\,{dm_\lambda }(x)\,dt\\&\quad -{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\partial }_t (u^2(t,x))\ {\partial }_t\left[t(\phi _t\sharp _\lambda g(x))^2\right]\,{dm_\lambda }(x)dt\\&=-4{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x)\ {\partial }_x u(t,x)\ \phi _t\sharp _\lambda g(x)\ {\partial }_x(\phi _t\sharp _\lambda g(x))t\,{dm_\lambda }(x)\,dt\\&\quad -2 {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x) {\partial }_t u(t,x)\ (\phi _t\sharp _\lambda g(x))^2\,{dm_\lambda }(x)dt\\&\quad -4 {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x) {\partial }_t u(t,x)\ t (\phi _t\sharp _\lambda g(x))\ {\partial }_t (\phi _t\sharp _\lambda g(x))\,{dm_\lambda }(x)dt\\&=-4{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x){\nabla _{t,\,x}}u(t,x)\cdot t (\phi _t\sharp _\lambda g(x))\ {\nabla _{t,\,x}}(\phi _t\sharp _\lambda g(x))\ \,{dm_\lambda }(x)\,dt\\&\quad - {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\partial }_t (u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2\,{dm_\lambda }(x)dt\\&=:{\mathrm{A}}+{\mathrm{B}}, \end{aligned}$$

where in the third equality, we used the following facts:

$$\begin{aligned}&{\int _{{\mathbb {R}}_+}}\lim _{x\rightarrow 0^+}{\partial }_x(u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2 x^{2\lambda }t\,dt=0, \end{aligned}$$
(4.10)
$$\begin{aligned}&{\int _{{\mathbb {R}}_+}}\lim _{x\rightarrow \infty }{\partial }_x(u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2 x^{2\lambda }t\,dt=0, \end{aligned}$$
(4.11)
$$\begin{aligned}&{\int _{{\mathbb {R}}_+}}\lim _{t\rightarrow 0^+}{\partial }_t (u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2\,t x^{2\lambda }\,dx=0, \end{aligned}$$
(4.12)

and

$$\begin{aligned} {\int _{{\mathbb {R}}_+}}\lim _{t\rightarrow \infty }{\partial }_t (u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2\,t x^{2\lambda }\,dx=0. \end{aligned}$$
(4.13)

In fact, for each \(t,\,x\in {\mathbb {R}}_+\), applying Hölder’s inequality and Lemma 2.8 (iii), we see that

$$\begin{aligned} |u(t, x)|\le & {} \left[{\int _0^\infty }{P^{[\lambda ]}_t}(x, y)|f(y)|^2y^{2\lambda }dy\right]^{\frac{1}{2}}\nonumber \\\lesssim & {} \left[{\int _0^\infty }\displaystyle \frac{t}{(|x-y|^2+t^2)^{\lambda +1}}|f(y)|^2y^{2\lambda }dy\right]^{\frac{1}{2}}\nonumber \\\lesssim & {} t^{-\lambda -\frac{1}{2}}\Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}, \end{aligned}$$
(4.14)

and

$$\begin{aligned} |\phi _t\sharp _\lambda g(x)|&\lesssim \left|{\int _0^\infty }|{\tau ^{[\lambda ]}_x}\phi _t(y)||g(y)|^2 y^{2\lambda }\,dy\right|^\frac{1}{2}\Vert {\tau ^{[\lambda ]}_x}\phi _t\Vert ^{\frac{1}{2}}_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\nonumber \\&\sim \left|{\int _0^\infty }t^{-2\lambda -1}\Vert \phi \Vert _{L^\infty ({{\mathbb {R}}}_+,\, dm_\lambda )}|g(y)|^2y^{2\lambda }\,dy\right|^\frac{1}{2}\Vert \phi \Vert ^{\frac{1}{2}}_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\nonumber \\&\sim t^{-\lambda -\frac{1}{2}}\Vert \phi \Vert ^\frac{1}{2}_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\Vert \phi \Vert ^\frac{1}{2}_{L^\infty ({{\mathbb {R}}}_+,\, dm_\lambda )}\Vert g\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}. \end{aligned}$$
(4.15)

Moreover, recall that for any \(x,y,t\in {\mathbb {R}}_+\),

$$\begin{aligned} \left|\partial _x{P^{[\lambda ]}_t}(x,y)\right|\lesssim \min \left\{ \displaystyle \frac{t}{(|x-y|+t)^{2\lambda +3}}, \displaystyle \frac{t}{x^\lambda y^\lambda (|x-y|+t)^{3}}\right\} . \end{aligned}$$

Using the Hölder inequality, we then have that for any \(t, x\in {\mathbb {R}}_+\),

$$\begin{aligned} |{\partial }_x u(t,x)|&\lesssim \Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\left\{ \left[ \int _0^{2x}\frac{x^{2\lambda }}{(|x-y|+t)^{4\lambda +4}}dy\right] ^\frac{1}{2}\right. \\&\quad \left. +\left[ \int _{2x}^\infty \frac{1}{(y+t)^{2\lambda +4}}dy\right] ^\frac{1}{2}\right\} \\&\lesssim \Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\left\{ \frac{x^{\lambda +\frac{1}{2}}}{t^{2\lambda +2}}+\frac{1}{(x+t)^{\lambda +\frac{3}{2}}}\right\} . \end{aligned}$$

These facts imply (4.10).
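
To spell out how these bounds combine, here is a sketch for fixed \(t\in {\mathbb {R}}_+\), assuming \(\lambda >0\). For \(0<x\le 1\), the estimates (4.14), (4.15) and the bound on \({\partial }_x u\) above are all bounded uniformly in \(x\), so that

$$\begin{aligned} \left|{\partial }_x(u^2(t,x))\right|(\phi _t\sharp _\lambda g(x))^2x^{2\lambda }=2|u(t,x)|\,|{\partial }_x u(t,x)|\,|\phi _t\sharp _\lambda g(x)|^2x^{2\lambda }\le C(t,f,g,\phi )\,x^{2\lambda }\rightarrow 0 \end{aligned}$$

as \(x\rightarrow 0^+\), where \(C(t,f,g,\phi )\) is a finite quantity independent of \(x\) (notation introduced only for this sketch).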

To verify (4.11), note that

$$\begin{aligned} \left|{P^{[\lambda ]}_t}(x, y)\right|\lesssim \frac{1}{m_\lambda (I(x, |x-y|+t))}\frac{t}{|x-y|+t}. \end{aligned}$$

Then by Hölder’s inequality, we have that

$$\begin{aligned} |u(t,x)|&\lesssim \Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\left[ \sum _{k=0}^\infty 2^{-2k}\displaystyle \int _{|x-y|<2^kt}\frac{1}{[m_\lambda (I(x, 2^{k-1}t))]^2}{dm_\lambda }(y)\right] ^\frac{1}{2}\\&\lesssim \Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\frac{1}{x^{\lambda }t^\frac{1}{2}}, \end{aligned}$$

and

$$\begin{aligned} |{\partial }_x u(t,x)|\lesssim \frac{t}{x^\lambda }\left( \int _{{\mathbb {R}}_+}\frac{1}{(|x-y|+t)^6}dy\right) ^\frac{1}{2}\lesssim \frac{1}{x^\lambda t^{\frac{3}{2}}}. \end{aligned}$$

For fixed \(t>0\) and \(x>t\), the Hankel translation \({\tau ^{[\lambda ]}_x}\phi _t\) satisfies \(\,{\mathrm {supp}\,}({\tau ^{[\lambda ]}_x}\phi _t)\subset (x-t, x+t)\). From the Hölder inequality and the fact that \(g\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\), we deduce that

$$\begin{aligned} |\phi _t\sharp _\lambda g(x)|\le & {} \Vert {\tau ^{[\lambda ]}_x}\phi _t\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\Vert g\chi _{(x-t, x+t)}\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\\\le & {} t^{-\lambda -\frac{1}{2}}\Vert \phi \Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\Vert g\chi _{(x-t, x+t)}\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\rightarrow 0, \end{aligned}$$

as \(x\rightarrow \infty \). These facts imply (4.11).
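
In more detail, a sketch for fixed \(t>0\) and \(x>t\): the two pointwise bounds above give

$$\begin{aligned} \left|{\partial }_x(u^2(t,x))\right|x^{2\lambda }=2|u(t,x)|\,|{\partial }_x u(t,x)|\,x^{2\lambda }\le C(f)\,\frac{1}{x^{\lambda }t^{\frac{1}{2}}}\cdot \frac{1}{x^{\lambda }t^{\frac{3}{2}}}\cdot x^{2\lambda }=C(f)\,t^{-2}, \end{aligned}$$

where \(C(f)\) depends only on \(f\) (notation used only here). This factor is bounded in \(x\), while \((\phi _t\sharp _\lambda g(x))^2\rightarrow 0\) as \(x\rightarrow \infty \), so the product tends to zero.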

Note that

$$\begin{aligned} \left|t\partial _t{P^{[\lambda ]}_t}(x,y)\right|\lesssim \displaystyle \frac{t}{(|x-y|^2+t^2)^{\lambda +1}}. \end{aligned}$$
(4.16)

Then we also have

$$\begin{aligned} |t{\partial }_t u(t,x)|&\lesssim t^{-\lambda -\frac{1}{2}}\Vert f\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}, \end{aligned}$$

which together with (4.14) and (4.15) implies (4.13).
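
The exponent count behind this step can be written out as follows (a sketch; \(C\) denotes a constant depending only on the norms of \(f\), \(g\) and \(\phi \) appearing in (4.14) and (4.15)). For each fixed \(x\in {\mathbb {R}}_+\),

$$\begin{aligned} \left|{\partial }_t(u^2(t,x))\right|(\phi _t\sharp _\lambda g(x))^2\,t&=2|u(t,x)|\,|t{\partial }_t u(t,x)|\,|\phi _t\sharp _\lambda g(x)|^2\\&\le C\,t^{-\lambda -\frac{1}{2}}\cdot t^{-\lambda -\frac{1}{2}}\cdot t^{-2\lambda -1}=C\,t^{-4\lambda -2}\rightarrow 0 \end{aligned}$$

as \(t\rightarrow \infty \).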

To show (4.12), observe that the maximal function

$$\begin{aligned} \Phi _+(g)(x):=\sup _{t>0}|\phi _t\sharp _\lambda g(x)| \end{aligned}$$

is bounded on \({L^p({{\mathbb {R}}}_+,\, dm_\lambda )}\) for \(p\in (1, \infty )\). Then, arguing as in [21, Corollary 2.9], we see that for \(g\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\),

$$\begin{aligned} \lim _{t\rightarrow 0^+}\phi _t\sharp _\lambda g(x)=\Vert \phi \Vert _{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}g(x),\,\mathrm{\,a.\,e. \,}x\in {\mathbb {R}}_+. \end{aligned}$$
(4.17)

Similarly, we have that

$$\begin{aligned} \lim _{t\rightarrow 0^+}u(t,x)=f(x),\,\mathrm{\,a.\,e. \,}x\in {\mathbb {R}}_+. \end{aligned}$$
(4.18)

Then to show (4.12), it suffices to prove that for a. e. \(x\in {\mathbb {R}}_+\),

$$\begin{aligned} \lim _{t\rightarrow 0^+} t\partial _t u(t, x)=0. \end{aligned}$$
(4.19)

To see this, we first note that for any \(x,y, t\in {\mathbb {R}}_+\), \(t\partial _t {P^{[\lambda ]}_t}(x,y) = {P^{[\lambda ]}_t}(x,y) -{{{\tilde{P}}}^{[\lambda ]}_t}(x,y) \), where

$$\begin{aligned} {{{\tilde{P}}}^{[\lambda ]}_t}(x,y)=\frac{4\lambda (\lambda +1)}{\pi }\int _0^\pi \frac{t^3(\sin \theta ) ^{2\lambda -1}}{(x^2+y^2+t^2-2xy\cos \theta )^{\lambda +2}}d\theta . \end{aligned}$$
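
This identity can be checked directly, assuming the Poisson kernel has the integral representation \({P^{[\lambda ]}_t}(x,y)=\frac{2\lambda t}{\pi }\int _0^\pi \frac{(\sin \theta )^{2\lambda -1}}{(x^2+y^2+t^2-2xy\cos \theta )^{\lambda +1}}d\theta \) (this formula is not restated in this section, so it is an assumption here). Writing \(D_\theta :=x^2+y^2+t^2-2xy\cos \theta \), a shorthand used only in this sketch, we get

$$\begin{aligned} t\partial _t {P^{[\lambda ]}_t}(x,y)&=\frac{2\lambda t}{\pi }\int _0^\pi \frac{(\sin \theta )^{2\lambda -1}}{D_\theta ^{\lambda +1}}d\theta +t\cdot \frac{2\lambda t}{\pi }\int _0^\pi (\sin \theta )^{2\lambda -1}\,\partial _t\left(D_\theta ^{-\lambda -1}\right)d\theta \\&={P^{[\lambda ]}_t}(x,y)-\frac{2\lambda t^2}{\pi }\int _0^\pi \frac{2(\lambda +1)\,t\,(\sin \theta )^{2\lambda -1}}{D_\theta ^{\lambda +2}}d\theta \\&={P^{[\lambda ]}_t}(x,y)-\frac{4\lambda (\lambda +1)}{\pi }\int _0^\pi \frac{t^3(\sin \theta )^{2\lambda -1}}{D_\theta ^{\lambda +2}}d\theta , \end{aligned}$$

and the last integral is exactly \({{{\tilde{P}}}^{[\lambda ]}_t}(x,y)\).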

For any \(x,t\in {\mathbb {R}}_+\), since

$$\begin{aligned} \int _{{\mathbb {R}}_+} t\partial _t {P^{[\lambda ]}_t}(x,y)y^{2\lambda }dy=0 \quad {\mathrm{and}} \quad \int _{{\mathbb {R}}_+}{P^{[\lambda ]}_t}(x,y)y^{2\lambda }dy=1, \end{aligned}$$

we get that

$$\begin{aligned} \int _{{\mathbb {R}}_+}{{{\tilde{P}}}^{[\lambda ]}_t}(x,y)y^{2\lambda }dy=1. \end{aligned}$$

Moreover, \({{{\tilde{P}}}^{[\lambda ]}_t}\) is an approximation of the identity, as is \({P^{[\lambda ]}_t}\) (see [80, Lemma 2.1]). Set

$$\begin{aligned} {{{\tilde{P}}}^{[\lambda ]}_t}(f)(x)=\int _{{\mathbb {R}}_+}{{{\tilde{P}}}^{[\lambda ]}_t}(x,y)f(y)y^{2\lambda }dy,\quad x,t\in {\mathbb {R}}_+. \end{aligned}$$

Then, by the same argument as for (4.17), we have that for \(f\in {L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) and a. e. \(x\in {\mathbb {R}}_+\),

$$\begin{aligned} \lim _{t\rightarrow 0^+}{{{\tilde{P}}}^{[\lambda ]}_t}(f)(x)=f(x). \end{aligned}$$

This, together with (4.18) and the fact that \(t\partial _t u(t, x)=u(t,x)-{{{\tilde{P}}}^{[\lambda ]}_t}(f)(x)\), implies (4.19). Hence, (4.12) holds.

For the term \({\mathrm{A}}\), using Hölder’s inequality and Cauchy’s inequality, we obtain that

$$\begin{aligned} |{\mathrm{A}}|\le & {} 4\left[{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)|{\nabla _{t,\,x}}(\phi _t\sharp _\lambda g(x))|^2 t {dm_\lambda }(x)\,dt\right]^{\frac{1}{2}} \nonumber \\&\times \left[{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t,\,x}}u(t,x)\right|^2\left|\phi _t\sharp _\lambda g(x)\right|^2t {dm_\lambda }(x)\,dt\right]^{\frac{1}{2}}\nonumber \\\le & {} C{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)|{\nabla _{t,\,x}}(\phi _t\sharp _\lambda g(x))|^2 t {dm_\lambda }(x)\,dt\nonumber \\&+ {1\over 8}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t,\,x}}u(t,x)\right|^2\left|\phi _t\sharp _\lambda g(x)\right|^2 t {dm_\lambda }(x)\,dt\nonumber \\= & {} C{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)|t{\nabla _{t,\,x}}(\phi _t\sharp _\lambda g(x))|^2 {{dm_\lambda }(x)\,dt \over t} + {{\mathrm{I}}\over 8}. \end{aligned}$$
(4.20)
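
The second inequality in (4.20) is an instance of the elementary bound \(2XY\le X^2+Y^2\). As a sketch, with \(a\) and \(b\) denoting the two square-root factors in the first line of (4.20) (so that \(b^2={\mathrm{I}}\)), one admissible choice of the constant is \(C=32\):

$$\begin{aligned} 4ab=2\left(4\sqrt{2}\,a\right)\left(\frac{b}{2\sqrt{2}}\right)\le 32\,a^2+\frac{b^2}{8}. \end{aligned}$$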

For term \({\mathrm{B}}\), from integration by parts, we have

$$\begin{aligned} {\mathrm{B}}&= - \int _{{\mathbb {R}}_+} u^2(t,x)\ (\phi _t\sharp _\lambda g(x))^2\bigg |_{t=0}^{t=\infty } {dm_\lambda }(x) \\&\quad + {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ {\partial }_t (\phi _t\sharp _\lambda g(x))^2\,{dm_\lambda }(x)dt \\&=: {\mathrm{B}}_1+{\mathrm{B}}_2. \end{aligned}$$

By (4.14) and (4.15), we see that

$$\begin{aligned} {\int _{{\mathbb {R}}_+}}\lim _{t\rightarrow \infty }(u^2(t,x))\ (\phi _t\sharp _\lambda g(x))^2\,t x^{2\lambda }\,dx=0. \end{aligned}$$

Moreover, it follows from (4.17) and (4.18) that

$$\begin{aligned} {\mathrm{B}}_1 \sim \int _{{\mathbb {R}}_+} f(x)^2g(x)^2 {dm_\lambda }(x). \end{aligned}$$
(4.21)

For \({\mathrm{B}}_2\), we get that

$$\begin{aligned} {\mathrm{B}}_2= 2{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ {\partial }_t (\phi _t\sharp _\lambda g(x))\,{dm_\lambda }(x)dt. \end{aligned}$$

Then, using Lemma 4.1, for the function \(\phi \), there exists a function \(\psi (t,x,y)\) such that \(\psi \) satisfies all properties listed in (i)–(iii) in Lemma 4.1. Hence, we get that

$$\begin{aligned} {\mathrm{B}}_2&= -2 {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ {\mathcal {D}}^*\big (\psi (g)(t,x)\big ){dm_\lambda }(x)dt\\&= 2{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ {\partial }_x(\psi (g))(t, x)\ x^{2\lambda }\, dxdt\\&\quad + 2{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ \frac{2\lambda }{x}\psi (g)(t, x){dm_\lambda }(x)dt\\&=: 2{\mathcal {B}}_{21}+ 2{\mathcal {B}}_{22}. \end{aligned}$$

For \({\mathcal {B}}_{21}\), integration by parts gives

$$\begin{aligned} {\mathcal {B}}_{21}= & {} -{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\partial }_x \bigg ( u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\bigg ) \psi (g)(t, x) dxdt\\&+ \int _{{\mathbb {R}}_+} u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\bigg |_{x=0}^{x=\infty }dt\\= & {} - 2{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x) {\partial }_x u(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\ \psi (g)(t, x) dxdt\\&-{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ {\partial }_x \big (\phi _t\sharp _\lambda g(x)\big ) \ x^{2\lambda }\ \psi (g)(t, x)dxdt\\&-{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ 2\lambda \,x^{2\lambda -1}\ \psi (g)(t, x)dxdt\\&+ \int _{{\mathbb {R}}_+} u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\bigg |_{x=0}^{x=\infty }dt, \end{aligned}$$

where the third term on the right-hand side equals \(- {\mathcal {B}}_{22}\). Hence,

$$\begin{aligned} {\mathrm{B}}_2/2&= - 2{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u(t,x) {\partial }_x u(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\ \psi (g)(t, x)dxdt\\&\quad -{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}u^2(t,x)\ {\partial }_x \big (\phi _t\sharp _\lambda g(x)\big ) \ x^{2\lambda }\ \psi (g)(t, x)dxdt\\&\quad + \int _{{\mathbb {R}}_+} u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\bigg |_{x=0}^{x=\infty }dt\\&=: {\mathrm{B}}_{21}+ {\mathrm{B}}_{22}+{\mathrm{B}}_{23}. \end{aligned}$$

For the term \({\mathrm{B}}_{23}\), we first note that

$$\begin{aligned} {\mathrm{B}}_{23}&= - \int _{{\mathbb {R}}_+} \lim _{x\rightarrow 0} \Big ( u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\Big ) \, dt\\&\quad + \int _{{\mathbb {R}}_+} \lim _{x\rightarrow +\infty } \Big ( u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\Big ) \, dt\\&=: {\mathrm{B}}_{231}+ {\mathrm{B}}_{232}. \end{aligned}$$

We claim that

$$\begin{aligned} {\mathrm{B}}_{231}={\mathrm{B}}_{232}=0. \end{aligned}$$
(4.22)

We first consider \({\mathrm{B}}_{231}\). By (4.2), \({\mathrm{(K_{i}})}\) and (4.3), we see that for all t and x,

$$\begin{aligned} \Vert \psi (t, x, \cdot )\Vert _{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\lesssim 1, \end{aligned}$$
(4.23)

and for all t, x and y,

$$\begin{aligned} |\psi (t, x, y)|\lesssim \frac{x}{t^{2\lambda +2}}. \end{aligned}$$
(4.24)

Thus, we have that

$$\begin{aligned} \lim _{x\rightarrow 0}|\psi (g)(t, x)|\le & {} \lim _{x\rightarrow 0}\Vert \psi (t, x, \cdot )\Vert ^\frac{1}{2}_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\left[{\int _0^\infty }|\psi (t, x,y)||g(y)|^2{dm_\lambda }(y)\right]^{\frac{1}{2}}\\\lesssim & {} \lim _{x\rightarrow 0} x^\frac{1}{2}\Vert g\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}=0. \end{aligned}$$

Therefore, we obtain that

$$\begin{aligned} \lim _{x\rightarrow 0} \Big | u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\Big |=0, \end{aligned}$$

which gives that \({\mathrm{B}}_{231}=0\). Next we consider the term \({\mathrm{B}}_{232}\). From (4.6), (4.23), (4.24) and the Hölder inequality, we deduce that for all t and x,

$$\begin{aligned} |\psi (g)(t, x)|\le & {} \Vert \psi (t, x, \cdot )\Vert ^\frac{1}{2}_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\left[{\int _0^\infty }|\psi (t,x, y)||g(y)|^2{dm_\lambda }(y)\right]^{\frac{1}{2}}\\\lesssim & {} x^{-\lambda }t^{-1/2}\Vert g\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}. \end{aligned}$$

By these and the fact that for each x,

$$\begin{aligned} |\phi _t\sharp _\lambda g(x)|\le & {} \Vert \phi _t\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\Vert g\Vert _{L^2({{\mathbb {R}}}_+,\, dm_\lambda )}, \end{aligned}$$

we obtain that

$$\begin{aligned} \lim _{x\rightarrow +\infty } \Big | u^2(t,x)\ \phi _t\sharp _\lambda g(x)\ x^{2\lambda }\psi (g)(t, x)\Big | =0, \end{aligned}$$

which implies that

$$\begin{aligned} {\mathrm{B}}_{232}=0. \end{aligned}$$

Hence, the claim (4.22) holds.

As in the estimate for the term \({\mathrm{A}}\), for the term \({\mathrm{B}}_{21}\), using Hölder’s inequality and Cauchy’s inequality, we obtain that

$$\begin{aligned} |{\mathrm{B}}_{21}|\le & {} {1\over 8} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|{\nabla _{t,\,x}}u(t,x)|^2 |\phi _t\sharp _\lambda g(x)|^2\ t {dm_\lambda }(x)dt \nonumber \\&+C{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t,x)|^2 |\psi (g)(t, x)|^2\ {{dm_\lambda }(x)dt\over t}\nonumber \\= & {} {{\mathrm{I}}\over 8} +C{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t,x)|^2 |\psi (g)(t, x)|^2\ {{dm_\lambda }(x)dt\over t}. \end{aligned}$$
(4.25)

Again, for the term \({\mathrm{B}}_{22}\) we have

$$\begin{aligned} |{\mathrm{B}}_{22}|\lesssim & {} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t,x)|^2 \Big | t{\partial }_x\big (\phi _t\sharp _\lambda g(x)\big )\Big |^2\ {{dm_\lambda }(x)dt\over t} \nonumber \\&+{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t,x)|^2 |\psi (g)(t, x)|^2\ {{dm_\lambda }(x)dt\over t}. \end{aligned}$$
(4.26)

Combining the estimates of \({\mathrm{A}}\) and \({\mathrm{B}}\), namely (4.20), (4.21), (4.25), (4.26) and (4.22), and moving the two terms \({{\mathrm{I}}\over 8}\) to the left-hand side, we obtain that

$$\begin{aligned} {\mathrm{I}}\lesssim & {} \int _{{\mathbb {R}}_+} f(x)^2g(x)^2 {dm_\lambda }(x) +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t,x)|^2|t\partial _x (\phi _t\sharp _\lambda g(x))|^2 {{dm_\lambda }(x)\,dt \over t}\\&+{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t,x)|^2|t\partial _t (\phi _t\sharp _\lambda g(x))|^2 {{dm_\lambda }(x)\,dt \over t}\\&+{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t,x)|^2 |\psi ( g)(t,x)|^2\ {{dm_\lambda }(x)dt\over t}. \end{aligned}$$
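
The bookkeeping in this absorption step can be sketched as follows, assuming \({\mathrm{I}}<\infty \) (which the absorption tacitly requires). Let \(\Sigma \) denote the sum of the four integrals on the right-hand side of the preceding display (notation used only here). Since \({\mathrm{I}}\ge 0\), \(-2{\mathrm{I}}={\mathrm{A}}+{\mathrm{B}}\), \({\mathrm{B}}={\mathrm{B}}_1+{\mathrm{B}}_2\) and \({\mathrm{B}}_2=2({\mathrm{B}}_{21}+{\mathrm{B}}_{22}+{\mathrm{B}}_{23})\) with \({\mathrm{B}}_{23}=0\) by (4.22), we have

$$\begin{aligned} 2\,{\mathrm{I}}=|{\mathrm{A}}+{\mathrm{B}}|&\le |{\mathrm{A}}|+|{\mathrm{B}}_1|+2|{\mathrm{B}}_{21}|+2|{\mathrm{B}}_{22}|\\&\le \frac{{\mathrm{I}}}{8}+\frac{{\mathrm{I}}}{4}+C\,\Sigma , \end{aligned}$$

so that \(\frac{13}{8}{\mathrm{I}}\le C\,\Sigma \).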

We now define

$$\begin{aligned} Q_t( g)(x) := \big ( t \partial _t (\phi _t\sharp _\lambda g(x)),\ t \partial _x (\phi _t\sharp _\lambda g(x)),\psi ( g)(t,x)\big ). \end{aligned}$$

Then we have

$$\begin{aligned} {\mathrm{I}}\lesssim & {} \int _{{\mathbb {R}}_+} f(x)^2g(x)^2 {dm_\lambda }(x) +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t,x)|^2 |Q_t( g)(x)|^2 {{dm_\lambda }(x)\,dt \over t}. \end{aligned}$$

This finishes the proof of Lemma 4.2. \(\square \)

Next we have the following result for the product case, which follows by iterating Lemma 4.2. Before stating it, we introduce the notation \(\phi _{t_1}\sharp _{\lambda ,\,1} g(x_1, x_2)\), \(\phi _{t_2}\sharp _{\lambda ,\,2} g(x_1, x_2)\) and \(\phi _{t_1}\phi _{t_2}\sharp _{\lambda ,\,1,\,2} g(x_1, x_2)\) to denote the convolution with respect to the first, second and both variables, respectively.

Lemma 4.3

Let \(u(t_1, t_2, x_1, x_2):={P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f(x_1, x_2)\) and \(\phi \) be a smooth function as in Lemma 4.1. Then there exists a positive constant C such that for any \(f,\,g\in L^2({\mathbb {R}}_\lambda )\) with \(fg\in L^2({\mathbb {R}}_\lambda )\),

$$\begin{aligned} {{{{\tilde{\mathrm{I}}}}}}&:={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t_1,\,x_1}}{\nabla _{t_2,\,x_2}}u(t_1, t_2, x_1, x_2)\right|^2 \\&\quad \times \left|\phi _{t_1}\phi _{t_2}\sharp _{\lambda ,\,1,\,2} g(x_1, x_2)\right|^2 t_1 t_2{\,d\mu _\lambda (x_1,x_2)\,dt_1\,dt_2}\\&\le C\bigg \{{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}[f(x_1, x_2)]^2 [g(x_1, x_2)]^2{\,d\mu _\lambda (x_1,x_2)}\\&\quad +{\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_2}}f(x_1, x_2)\right|^2\left[ Q^{(2)}_{t_2}(g)(x_1,x_2) \right]^2{\frac{{dm_\lambda }(x_2)\,dt_2}{t_2}}{\,dm_\lambda (x_1)}\\&\quad +{\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_1}}f(x_1, x_2)\right|^2 \left[ Q^{(1)}_{t_1}(g)(x_1,x_2) \right]^2\,{\frac{{dm_\lambda }(x_1)\,dt_1}{t_1}}{\,dm_\lambda (x_2)}\\&\quad +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t_1,t_2,x_1, x_2)|^2 \Big | Q^{(1)}_{t_1} Q^{(2)}_{t_2}(g)(x_1,x_2) \Big |^2 {{\,d\mu _\lambda (x_1,x_2)\,dt_1\,dt_2}\over t_1t_2}\bigg \}. \end{aligned}$$

Here the operator \(Q^{(1)}_{t_1}\) is defined as

$$\begin{aligned} Q^{(1)}_{t_1}(g)(x_1,x_2) := \big ( t_1 \partial _{t_1} (\phi _{t_1}\sharp _{\lambda ,\,1} g)(x_1,x_2),\ t_1 \partial _{x_1} (\phi _{t_1}\sharp _{\lambda ,\,1} g)(x_1,x_2), \psi ( g(\cdot ,x_2) )(t_1,x_1)\big ), \end{aligned}$$

where

$$\begin{aligned} \psi ( g(\cdot ,x_2) )(t_1,x_1):= \int _0^\infty \psi (t_1,x_1,y_1)g(y_1,x_2) {dm_\lambda }(y_1) \end{aligned}$$

is obtained from Lemma  4.1. Similarly, define

$$\begin{aligned} Q^{(2)}_{t_2}(g)(x_1,x_2) := \big ( t_2 \partial _{t_2} (\phi _{t_2}\sharp _{\lambda ,\,2} g)(x_1,x_2),\ t_2 \partial _{x_2} (\phi _{t_2}\sharp _{\lambda ,\,2} g)(x_1,x_2), \psi ( g(x_1, \cdot ) )(t_2,x_2)\big ),\nonumber \\ \end{aligned}$$
(4.27)

where

$$\begin{aligned} \psi ( g(x_1, \cdot ) )(t_2,x_2):= \int _0^\infty \psi (t_2,x_2,y_2)g(x_1,y_2) {dm_\lambda }(y_2). \end{aligned}$$

Proof

By Lemma 4.2 for \(t_1\) and \(x_1\) and \(S_{{\mathrm{(iii)}}}\) in Lemma 2.8, we have that

$$\begin{aligned} {{\tilde{\mathrm{I}}}}&={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t_1,\,x_1}}{P^{[\lambda ]}_{t_1}}\Big ({\nabla _{t_2,\,x_2}}{P^{[\lambda ]}_{t_2}}f(\cdot , t_2, \cdot , x_2) \Big ) (t_1,x_1)\right|^2\\&\quad \times \left| \phi _{t_1} \sharp _{\lambda ,\,1} \Big (\phi _{t_2}\sharp _{\lambda ,\,2} g(\cdot , x_2) \Big )(x_1) \right|^2 t_1 \ t_2 {\,d\mu _\lambda (x_1,x_2)\,dt_1\,dt_2}\\&\lesssim {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\int _{{\mathbb {R}}_+}}\left|{\nabla _{t_2,\,x_2}}\left({P^{[\lambda ]}_{t_2}}f\right)(x_1, x_2)\right|^2\left| \left(\phi _{t_2}\sharp _{\lambda ,\,2} g\right)(x_1, x_2)\right|^2{\,dm_\lambda (x_1)}\ t_2{\,dm_\lambda (x_2)}\,dt_2\\&\quad +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t_2,\,x_2}}\left({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\right)(x_1, x_2)\right|^2\\&\quad \times \left[ Q^{(1)}_{t_1} ( \phi _{t_2}\sharp _{\lambda ,\,2} g(\cdot , x_2) )(x_1) \right]^2{\frac{{dm_\lambda }(x_1)\,dt_1}{t_1}}\,t_2{\,dm_\lambda (x_2)}\,dt_2\\&= {\int _{{\mathbb {R}}_+}}\ {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t_2,\,x_2}}\left({P^{[\lambda ]}_{t_2}}f(x_1, \cdot )\right)(x_2)\right|^2\left| \left(\phi _{t_2}\sharp _{\lambda ,\,2} g(x_1, \cdot )\right)(x_2)\right|^2\ t_2{\,dm_\lambda (x_2)}\,dt_2{\,dm_\lambda (x_1)}\\&\quad +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{\nabla _{t_2,\,x_2}}{P^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{t_1}}f(x_1, \cdot ) \right)( x_1, x_2)\right|^2\\&\quad \times \left[ \phi _{t_2}\sharp _{\lambda ,\,2}\big ( Q^{(1)}_{t_1} ( g(\cdot , \cdot ) )(x_1)\big )(x_2) \right]^2{\frac{{dm_\lambda }(x_1)\,dt_1}{t_1}}\,t_2{\,dm_\lambda (x_2)}\,dt_2\\&=:{{{\tilde{{\mathrm{I}}}}_1}+{{{\tilde{\mathrm{I}}}}_2}}, \end{aligned}$$

where in the second-to-last equality, we use the fact that the order of \(\sharp _{\lambda ,\,2}\) and \(Q^{(1)}_{t_1}\) can be changed since they act on different variables and \(Q^{(1)}_{t_1}\) is a linear integral operator.

Now we apply Lemma 4.2 to \({{ {{{\tilde{\mathrm{I}}}}}_1}}\) and see that

$$\begin{aligned} {{{\tilde{{\mathrm{I}}}}_1}}\lesssim & {} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|f(x_1, x_2)|^2 |g(x_1, x_2)|^2{\,dm_\lambda (x_1)}{\,dm_\lambda (x_2)}\\&+{\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_2}}f(x_1, x_2)\right|^2\left[ Q^{(2)}_{t_2}(g)(x_1,x_2) \right]^2{\frac{{dm_\lambda }(x_2)\,dt_2}{t_2}}{\,dm_\lambda (x_1)}. \end{aligned}$$

Similarly, another application of Lemma 4.2 yields that

$$\begin{aligned} {{{\tilde{{\mathrm{I}}}}_2}}\lesssim & {} {\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_1}}f(x_1, x_2)\right|^2 \left[ Q^{(1)}_{t_1}(g)(x_1,x_2) \right]^2{\,dm_\lambda (x_2)}\,{\frac{{dm_\lambda }(x_1)\,dt_1}{t_1}}\\&+{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t_1,t_2,x_1, x_2)|^2 \Big | Q^{(1)}_{t_1} Q^{(2)}_{t_2}(g)(x_1,x_2) \Big |^2 {{\,d\mu _\lambda (x_1,x_2)\,dt_1\,dt_2}\over t_1t_2}. \end{aligned}$$

This finishes the proof of Lemma 4.3. \(\square \)

Proof of \(\Vert f\Vert _{H^p_{S_u}( {\mathbb {R}}_\lambda )} \le C \Vert f\Vert _{H^p_{{{\mathcal {N}}}_P}({\mathbb {R}}_\lambda )}\)

For any \(\alpha >0\) and \(f\in {L^2({\mathbb {R}}_\lambda )}\) satisfying \({{\mathcal {N}}}_P(f) \in {L^p({\mathbb {R}}_\lambda )}\), we define

$$\begin{aligned} {{\mathcal {A}}}(\alpha ):=\left\{ (x,y)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\ {{\mathcal {M}}_S}\big (\chi _{\{{{\mathcal {N}}}_P(f)>\alpha \}}\big )(x,y)<\frac{1}{200}\frac{1}{2^{4\lambda +2}}\right\} . \end{aligned}$$

Here the maximal function \({{\mathcal {M}}_S}\) is defined as in (3.14). We first claim that

$$\begin{aligned}&\iint _{{{\mathcal {A}}}(\alpha )}S_u^2(f)(x_1,x_2){\,d\mu _\lambda (x_1,x_2)}\nonumber \\&\quad \le \iiiint _{R^{*}}\big | t_1t_2\nabla _{t_1,\,y_1}\nabla _{t_2,\,y_2} u(t_1,t_2,y_1,y_2) \big |^2\frac{d\mu _\lambda (y_1,y_2)\,dt_1dt_2}{t_1t_2}, \end{aligned}$$
(4.28)

where for \(t_1,\,t_2,\,y_1,\,y_2\in {\mathbb {R}}_+\), \(R(y_1, y_2, t_1, t_2):=I(y_1,t_1)\times I(y_2,t_2)\) and

$$\begin{aligned} R^{*}:=\Big \{(y_1,y_2,t_1,t_2):\ \displaystyle \frac{\mu _\lambda (\{ {{\mathcal {N}}}_P(f)>\alpha \}\cap R(y_1,y_2,t_1,t_2))}{\ \mu _\lambda (R(y_1,y_2,t_1,t_2))}<\frac{1}{200}\frac{1}{2^{4\lambda +2}}\Big \}. \end{aligned}$$

Indeed, observe that

$$\begin{aligned}&{\iint _{{{\mathcal {A}}}(\alpha )}S_u^2(f)(x_1,x_2){\,d\mu _\lambda (x_1,x_2)}}\\&\quad \le \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}\iint _{{{\mathcal {A}}}(\alpha )\cap R(y_1, y_2, t_1, t_2)}\\&\qquad \big | \nabla _{t_1,\,y_1}\nabla _{t_2,\,y_2} u(t_1,t_2,y_1,y_2) \big |^2 t_1t_2\frac{{\,d\mu _\lambda (x_1,x_2)}\ d\mu _\lambda (y_1,y_2) dt_1dt_2}{m_\lambda (I(x_1,t_1))m_\lambda (I(x_2,t_2))}. \end{aligned}$$

For any \((y_1,y_2,t_1,t_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\times {\mathbb {R}}_+\times {\mathbb {R}}_+\) with \(R(y_1,y_2,t_1,t_2) \bigcap {{\mathcal {A}}}(\alpha )\not =\emptyset \), there exists some \((x_1,x_2)\in R(y_1,y_2,t_1,t_2) \bigcap {{\mathcal {A}}}(\alpha )\) such that

$$\begin{aligned} {{\mathcal {M}}_S}\big (\chi _{\{{{\mathcal {N}}}_P(f)>\alpha \}}\big )(x_1,x_2)<\frac{1}{200}\frac{1}{2^{4\lambda +2}}. \end{aligned}$$

Hence we have

$$\begin{aligned} \displaystyle \frac{\mu _\lambda \left(\{ {{\mathcal {N}}}_P(f)>\alpha \}\bigcap R(y_1,y_2,t_1,t_2)\right)}{\mu _\lambda (R(y_1,y_2,t_1,t_2))}<\frac{1}{200}\frac{1}{2^{4\lambda +2}}. \end{aligned}$$

Then by the fact that for any \(y_1\in I(x_1, t_1)\) and \(y_2\in I(x_2, t_2)\),

$$\begin{aligned} m_\lambda (I(x_1,t_1))\sim m_\lambda (I(y_1,t_1))\quad {\mathrm{and}}\quad m_\lambda (I(x_2,t_2))\sim m_\lambda (I(y_2,t_2)), \end{aligned}$$

we have (4.28) and the claim holds.

Let \( g(x,y):=\chi _{\{ {{\mathcal {N}}}_P(f)\le \alpha \}}(x,y)\) and let \(\phi \in C^\infty ({\mathbb {R}}_+)\) be such that \(\,{\mathrm {supp}\,}(\phi )\subset (0, 1)\), \(\phi \equiv 1\) on (0, 1/2] and \(0\le \phi (x)\le 1\) for all \(x\in {\mathbb {R}}_+\). Then for \((x_1,x_2,t_1,t_2)\in R^{*}\), we have

$$\begin{aligned}&\phi _{t_1}\phi _{t_2}\sharp _{\lambda ,\,1,\,2} g(x_1, x_2)\nonumber \\&\quad =\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+} \tau _{x_1}^{[\lambda ]}\tau _{x_2}^{[\lambda ]}(\phi _{t_1}\phi _{t_2})(y_1,y_2)\chi _{\{ {{\mathcal {N}}}_P(f)\le \alpha \}}(y_1,y_2) dm_\lambda (y_1)dm_\lambda (y_2) \nonumber \\&\quad =\iint _{ \{ {{\mathcal {N}}}_P(f)\le \alpha \} \cap R(x_1,\,x_2,\,t_1,\,t_2) } \tau _{x_1}^{[\lambda ]}\tau _{x_2}^{[\lambda ]}(\phi _{t_1}\phi _{t_2})(y_1,y_2) dm_\lambda (y_1)dm_\lambda (y_2) \nonumber \\&\quad \gtrsim \iint _{ \{ {{\mathcal {N}}}_P(f)\le \alpha \} \cap R(x_1,\,x_2,\,t_1/2,\,t_2/2) } \frac{1}{\mu _\lambda ( R(x_1,x_2,t_1,t_2) )}dm_\lambda (y_1)dm_\lambda (y_2)\nonumber \\&\quad \gtrsim 1, \end{aligned}$$
(4.29)

where the last inequality follows from the fact that

$$\begin{aligned}&\mu _\lambda (\{(y_1,y_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\, {{\mathcal {N}}}_P(f)(y_1,y_2)\le \alpha \} \cap R(x_1,x_2,t_1/2,t_2/2) ) \\&\quad \ge \mu _\lambda (\{(y_1,y_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\, {{\mathcal {N}}}_P(f)(y_1,y_2)\le \alpha \} \cap R(x_1,x_2,t_1,t_2))\\&\qquad - \mu _\lambda (R(x_1,x_2,t_1,t_2)\setminus R(x_1,x_2,t_1/2,t_2/2))\\&\quad \ge \mu _\lambda (R(x_1,x_2,t_1/2,t_2/2))- \frac{1}{200}\frac{1}{2^{4\lambda +2}}\mu _\lambda ( R(x_1,x_2,t_1,t_2) ) \\&\quad \gtrsim \mu _\lambda ( R(x_1,x_2,t_1,t_2) ). \end{aligned}$$
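
One way to see the last step is the following sketch. It assumes the doubling bound \(m_\lambda (I(x,2r))\le 2^{2\lambda +1}m_\lambda (I(x,r))\) for all \(x,\,r\in {\mathbb {R}}_+\); this constant is an assumption here, suggested by the threshold \(2^{-(4\lambda +2)}\) but not restated in this section. Applying it in each variable gives \(\mu _\lambda (R(x_1,x_2,t_1/2,t_2/2))\ge 2^{-(4\lambda +2)}\mu _\lambda (R(x_1,x_2,t_1,t_2))\), and hence

$$\begin{aligned}&\mu _\lambda (R(x_1,x_2,t_1/2,t_2/2))- \frac{1}{200}\frac{1}{2^{4\lambda +2}}\mu _\lambda ( R(x_1,x_2,t_1,t_2) )\\&\quad \ge \left(1-\frac{1}{200}\right)\frac{1}{2^{4\lambda +2}}\mu _\lambda ( R(x_1,x_2,t_1,t_2) )=\frac{199}{200}\frac{1}{2^{4\lambda +2}}\mu _\lambda ( R(x_1,x_2,t_1,t_2) ). \end{aligned}$$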

Combining (4.28) and (4.29), and then using Lemma 4.3, we have

$$\begin{aligned}&\iint _{{{\mathcal {A}}}(\alpha )}S_u^2(f)(x_1,x_2){\,d\mu _\lambda (x_1,x_2)}\\&\quad \lesssim \iiiint _{R^{*}}\big | t_1t_2\nabla _{t_1,\,y_1}\nabla _{t_2,\,y_2} u(t_1,t_2,y_1,y_2) \big |^2 \left| \phi _{t_1}\phi _{t_2}\sharp _{\lambda ,\,1,\,2} g(y_1, y_2) \right|^2\frac{d\mu _\lambda (y_1,y_2)\,dt_1dt_2}{t_1t_2}\\&\quad \lesssim \bigg \{{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}[f(x_1, x_2)]^2 [g(x_1, x_2)]^2{\,d\mu _\lambda (x_1,x_2)}\\&\qquad +{\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_2}}f(x_1, x_2)\right|^2\left| Q^{(2)}_{t_2}(g)(x_1,x_2) \right|^2{\frac{{dm_\lambda }(x_2)\,dt_2}{t_2}}{\,dm_\lambda (x_1)}\\&\qquad +{\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{P^{[\lambda ]}_{t_1}}f(x_1, x_2)\right|^2 \left| Q^{(1)}_{t_1}(g)(x_1,x_2) \right|^2\,{\frac{{dm_\lambda }(x_1)\,dt_1}{t_1}}{\,dm_\lambda (x_2)}\\&\qquad +{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}| u(t_1,t_2,x_1, x_2)|^2 \Big | Q^{(1)}_{t_1} Q^{(2)}_{t_2}(g)(x_1,x_2) \Big |^2 {{\,d\mu _\lambda (x_1,x_2)\,dt_1\,dt_2}\over t_1t_2}\bigg \}\\&\quad =: {\mathrm{I+II+III+IV}}. \end{aligned}$$

For the term \({\mathrm{I}}\), we have

$$\begin{aligned} {\mathrm{I}}= & {} \iint _{\{{{\mathcal {N}}}_P(f)\le \alpha \}}f^2(y_1, y_2)d\mu _\lambda (y_1,y_2)\le \iint _{\{{{\mathcal {N}}}_P(f)\le \alpha \}}|{{\mathcal {N}}}_P(f)(y_1, y_2)|^2d\mu _\lambda (y_1,y_2). \end{aligned}$$

For the term \({\mathrm{II}}\), we claim that if \(|Q^{(2)}_{t_2}(g)(x_1,x_2)|\not =0\), then there exists some \(w_2\) such that \((x_1,w_2)\in \{{{\mathcal {N}}}_P(f)\le \alpha \}\) and \(|x_2-w_2|< t_2\). To see this, recall that \(Q^{(2)}_{t_2}(g)\) is defined in (4.27). If \(|Q^{(2)}_{t_2}(g)(x_1,x_2)|\not =0\), then at least one of the three terms \(\partial _{t_2} (\phi _{t_2}\sharp _{\lambda ,\,2} g)(x_1,x_2)\), \(\partial _{x_2} (\phi _{t_2}\sharp _{\lambda ,\,2} g)(x_1,x_2)\), \(\psi ( g(x_1, \cdot ) )(t_2,x_2)\) must be non-zero. Hence, there must be some \(w_2\) such that \((x_1,w_2)\) lies in the support of the function g and \(|x_2-w_2|< t_2\), which proves the claim.

Then we get that \(|{P^{[\lambda ]}_{t_2}}f(x_1, x_2)|\le \alpha \). As a consequence,

$$\begin{aligned} {\mathrm{II}}\le & {} \alpha ^2 {\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left| Q^{(2)}_{t_2}(g(x_1, \cdot ))(x_2) \right|^2{\frac{{dm_\lambda }(x_2)\,dt_2}{t_2}}{\,dm_\lambda (x_1)}\nonumber \\= & {} \alpha ^2 {\int _{{\mathbb {R}}_+}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left| Q^{(2)}_{t_2}(1-g(x_1, \cdot ))(x_2) \right|^2{\frac{{dm_\lambda }(x_2)\,dt_2}{t_2}}{\,dm_\lambda (x_1)}\nonumber \\\lesssim & {} \alpha ^2 {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|1- g(x_1,x_2)|^2 d\mu _\lambda (x_1,x_2) \nonumber \\\lesssim & {} \alpha ^2 \mu _\lambda (\{(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\,{{\mathcal {N}}}_P(f)(x_1,x_2)>\alpha \}), \end{aligned}$$
(4.30)

where the second inequality follows from the \(L^2({\mathbb {R}}_\lambda )\)-boundedness of the Littlewood–Paley square function, i. e., (3.1) of Theorem 3.1. For the term \({\mathrm{III}}\), by symmetry, we obtain the same estimate as for the term \({\mathrm{II}}\).

For the term \({\mathrm{IV}}\), if \(Q^{(1)}_{t_1} Q^{(2)}_{t_2}(g)(x_1,x_2)\not =0 \), then there exists some \((w_1,w_2)\in \{{{\mathcal {N}}}_P(f)\le \alpha \}\) such that \(|x_1-w_1|<t_1\) and \(|x_2-w_2|<t_2\). Hence \(|u(t_1,t_2,x_1, x_2)|\le \alpha \). Arguing as in (4.30), and using the \(L^2({\mathbb {R}}_\lambda )\)-boundedness of the product Littlewood–Paley square function, i. e., (3.2) of Theorem 3.1, we have

$$\begin{aligned} {\mathrm{IV}}\lesssim \alpha ^2\mu _\lambda \left(\left\{ (x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\,{{\mathcal {N}}}_P(f)(x_1,x_2)>\alpha \right\} \right). \end{aligned}$$

Combining the four terms above, we have

$$\begin{aligned}&\iint _{{\mathcal {A}}(\alpha )}S_u^2(f)(x_1,x_2)\,d\mu _\lambda (x_1,x_2)\nonumber \\&\quad \lesssim \alpha ^2 \mu _\lambda (\{(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\,{{\mathcal {N}}}_P(f)(x_1,x_2)>\alpha \})\nonumber \\&\qquad + \iint _{\{{{\mathcal {N}}}_P(f)\le \alpha \}}|{{\mathcal {N}}}_P(f)(x_1,\,x_2)|^2{\,d\mu _\lambda (x_1,x_2)}. \end{aligned}$$
(4.31)

By the \(L^2({\mathbb {R}}_\lambda )\)-boundedness of the strong maximal function \({{\mathcal {M}}}_S\), we have

$$\begin{aligned} \mu _\lambda \big ({\mathbb {R}}_+\times {\mathbb {R}}_+\setminus {\mathcal {A}} (\alpha )\big )&\lesssim \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+} \left[{{\mathcal {M}}}_S(\chi _{\{{{\mathcal {N}}}_P(f)>\alpha \}})(x_1,x_2)\right]^2 \,d\mu _\lambda (x_1,x_2)\nonumber \\&\lesssim \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}\left[\chi _{\{{{\mathcal {N}}}_P(f)>\alpha \}}(x_1,x_2)\right]^2 \,d\mu _\lambda (x_1,x_2)\nonumber \\&\sim \mu _\lambda \big (\{(x_1,\,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\,{{\mathcal {N}}}_P(f)(x_1,\,x_2)>\alpha \}\big ). \end{aligned}$$
(4.32)

Combining (4.31) and (4.32), we have

$$\begin{aligned}&\mu _\lambda \Big (\{(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\ S_u(f)(x_1,x_2)>\alpha \}\Big )\\&\quad \le \mu _\lambda \Big ( \{(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\ S_u(f)(x_1,x_2)>\alpha \} \cap {\mathcal {A}} (\alpha )\Big ) \\&\qquad + \mu _\lambda \Big ( \{(x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\ S_u(f)(x_1,x_2)>\alpha \} \setminus {\mathcal {A}} (\alpha )\Big )\\&\quad \lesssim \mu _\lambda \big (\{(x_1,\,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+:\,\,{{\mathcal {N}}}_P(f)(x_1,\,x_2)>\alpha \}\big )\\&\qquad + \frac{1}{\alpha ^2}\iint _{\{{{\mathcal {N}}}_P(f)\le \alpha \}}|{{\mathcal {N}}}_P(f)(x_1,x_2)|^2\,{\,d\mu _\lambda (x_1,x_2)}, \end{aligned}$$

which via a standard argument shows that \(\Vert S_u(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert {{\mathcal {N}}}_P(f)\Vert _{L^p({\mathbb {R}}_\lambda )}\). \(\square \)
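For completeness, the standard argument can be sketched as follows, assuming \(0<p<2\) (which covers the range \(p\le 1\) considered here) and using only the layer-cake formula and Fubini's theorem. The abbreviations \(S:=S_u(f)\) and \(N:={{\mathcal {N}}}_P(f)\) are introduced only for this remark. Multiplying the last display by \(p\alpha ^{p-1}\) and integrating over \(\alpha \in (0,\infty )\), we obtain

$$\begin{aligned} \Vert S\Vert _{L^p({\mathbb {R}}_\lambda )}^p&=p\int _0^\infty \alpha ^{p-1}\mu _\lambda (\{S>\alpha \})\,d\alpha \\&\lesssim p\int _0^\infty \alpha ^{p-1}\mu _\lambda (\{N>\alpha \})\,d\alpha +p\int _0^\infty \alpha ^{p-3}\iint _{\{N\le \alpha \}}N^2\,d\mu _\lambda \,d\alpha \\&=\Vert N\Vert _{L^p({\mathbb {R}}_\lambda )}^p+p\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}N^2\int _{N}^\infty \alpha ^{p-3}\,d\alpha \,d\mu _\lambda \\&=\Big (1+\frac{p}{2-p}\Big )\Vert N\Vert _{L^p({\mathbb {R}}_\lambda )}^p, \end{aligned}$$

where the inner integral equals \(N^{p-2}/(2-p)\) because \(p<2\).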

Step 3: \(\Vert f\Vert _{H^p_{{{\mathcal {N}}}_P}( {\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{\mathcal {R}}_P}({\mathbb {R}}_\lambda )}\) for \(f\in H^p_{{\mathcal {R}}_P}({\mathbb {R}}_\lambda ) \cap L^2({\mathbb {R}}_\lambda )\).

We first define the product grand maximal functions, borrowing an idea from [81] in the one-parameter setting (see also [37, 38]).

Definition 4.4

Let \(\beta _1,\beta _2,\gamma _1,\gamma _2\in (0,1]\). For any \(f\in \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\big )'\), we define the product grand maximal function as follows. For \((x_1,x_2)\in {\mathbb {R}}_+\times {\mathbb {R}}_+\),

$$\begin{aligned} G_{\beta _1,\,\beta _2,\,\gamma _1,\,\gamma _2}(f)(x_1,x_2):= \sup \big \{ \langle f,\varphi _1\varphi _2\rangle : \ \Vert \varphi _i\Vert _{{\mathcal {G}}(x_i,\,r_i,\,\beta _i,\,\gamma _i)}\le 1, r_i>0,\ i=1,2 \big \}. \end{aligned}$$
(4.33)

By the definition of \({{\mathcal {N}}}_{P}f\), we have

$$\begin{aligned} {{\mathcal {N}}}_{P}f(x_1, x_2)=\sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|< t_1}{|y_2-x_2|< t_2}}\!\left| \int _{{\mathbb {R}}_+}\int _{{\mathbb {R}}_+} P_{t_1}^{[\lambda ]}(y_1, z_1)P_{t_2}^{[\lambda ]}(y_2, z_2) f(z_1, z_2) d\mu _\lambda (z_1,z_2)\right|. \end{aligned}$$

Next, for \(i=1,2\), for any fixed \(y_i,t_i\in {\mathbb {R}}_+\), the Poisson kernel \(P_{t_i}^{[\lambda ]}(y_i, z_i)\), as a function of \(z_i\), satisfies the conditions (\({\mathrm{K_{i}}}\)) and (\({\mathrm{K_{ii}}}\)) as in Sect. 3.1, and hence, it is a test function of the type \((y_i,\,t_i,\,1,\,1)\), with the norm \(\big \Vert P_{t_i}^{[\lambda ]}(y_i, \cdot )\big \Vert _{{\mathcal {G}}(y_i,\,t_i,\,1,\,1)}=: C_\lambda \), where \(C_\lambda \) is a positive constant depending only on \(\lambda \) (see Definition 2.2 for the test function and its norm). Hence, it is a test function of the type \((y_i,\,t_i,\,\beta _i,\,\gamma _i)\) for every \(\beta _i,\gamma _i\in (0,1]\) with the norm \( C_\lambda \). Moreover, for any \(x_i\) with \(|x_i-y_i|<t_i\), we have that \(\big \Vert P_{t_i}^{[\lambda ]}(y_i, \cdot )\big \Vert _{{\mathcal {G}}(x_i,\,t_i,\,\beta _i,\,\gamma _i)}\lesssim C_\lambda \), where the implicit constant is independent of \(x_i,\,t_i,\,\beta _i\) and \(\gamma _i\).

Then, there exists a positive constant \(\widetilde{C}_\lambda \) such that for any \(x_1,\, x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} \sup _{|y_1-x_1|< t_1} \left\Vert{P}_{t_1}^{[\lambda ]}(y_1, \cdot )\right\Vert_{{\mathcal {G}}(x_1,\,t_1,\,\beta _1,\,\gamma _1)}= \sup _{|y_2-x_2|< t_2}\left\Vert{P}_{t_2}^{[\lambda ]}(y_2, \cdot )\right\Vert_{{\mathcal {G}}(x_2,\,t_2,\,\beta _2,\,\gamma _2)}=\widetilde{C}_\lambda . \end{aligned}$$

We then obtain that

$$\begin{aligned} {{\mathcal {N}}}_{P}f(x_1, x_2)\lesssim G_{\beta _1,\,\beta _2,\,\gamma _1,\,\gamma _2}(f)(x_1,x_2). \end{aligned}$$

Next we claim that

$$\begin{aligned} G_{\beta _1,\,\beta _2,\,\gamma _1,\,\gamma _2}(f)(x_1,x_2)\lesssim \big \{ {\mathcal {M}}_1{\mathcal {M}}_2( |{\mathcal {R}}_P(f)|^r )(x_1,x_2) \big \}^{1\over r} \end{aligned}$$
(4.34)

for any \(r\in ({2\lambda +1\over 2\lambda +2}, p)\) and \(f\in H^p_{{\mathcal {R}}_P}({\mathbb {R}}_\lambda ) \cap L^2({\mathbb {R}}_\lambda )\), where \({\mathcal {M}}_1(f)\) and \( {\mathcal {M}}_2(f)\) are as in (3.8) and (3.9), respectively.

This shows

$$\begin{aligned} {{\mathcal {N}}}_{P}f(x_1, x_2)\lesssim \big \{ {\mathcal {M}}_1{\mathcal {M}}_2( |{\mathcal {R}}_P(f)|^r )(x_1,x_2) \big \}^{1\over r}, \end{aligned}$$

which via the boundedness of \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\) on \(L^{p/r}({\mathbb {R}}_\lambda )\) implies our Step 3.
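In more detail, since \(r<p\), the exponent \(p/r\) is strictly greater than 1, so that \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\) are bounded on \(L^{p/r}({\mathbb {R}}_\lambda )\). A sketch of the resulting chain of inequalities, applying the two maximal functions one after the other, reads

$$\begin{aligned} \Vert {{\mathcal {N}}}_{P}f\Vert _{L^p({\mathbb {R}}_\lambda )}^p&\lesssim \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}\big [ {\mathcal {M}}_1{\mathcal {M}}_2( |{\mathcal {R}}_P(f)|^r )(x_1,x_2) \big ]^{p\over r}\,d\mu _\lambda (x_1,x_2)\\&\lesssim \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}\big [ |{\mathcal {R}}_P(f)(x_1,x_2)|^r \big ]^{p\over r}\,d\mu _\lambda (x_1,x_2) =\Vert {\mathcal {R}}_P(f)\Vert _{L^p({\mathbb {R}}_\lambda )}^p. \end{aligned}$$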

To prove (4.34), we first prove the following inequality:

$$\begin{aligned} |\langle f,\psi _1\psi _2\rangle |&\lesssim \Bigg [ {\mathcal {M}}_1\bigg ( {\mathcal {M}}_2\Big ( \left|{\mathcal {R}}_P(f)\right|^r \Big ) \bigg )(x_1,x_2) \Bigg ]^{1\over r} \end{aligned}$$
(4.35)

for any \(r\in ( {2\lambda +1\over 2\lambda +2}, p)\), \(f\in H^p_{{\mathcal {R}}_P}({\mathbb {R}}_\lambda ) \cap L^2({\mathbb {R}}_\lambda )\), \(\psi _1\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _1,\gamma _1)\) with \(\Vert \psi _1\Vert _{{\mathcal {G}}(x_1,\,2^{-\widetilde{k}_1},\,\beta _1,\,\gamma _1)}\le 1\) and \(\psi _2\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _2,\gamma _2)\) with \(\Vert \psi _2\Vert _{{\mathcal {G}}(x_2,\,2^{-\widetilde{k}_2},\,\beta _2,\,\gamma _2)}\le 1\), where \(\widetilde{k}_i := \lfloor \log _2 t_i \rfloor +1\) for \(i=1,2\).

To see this, consider the following approximations to the identity. For each \(k\in {\mathbb {Z}}\), define the operator

$$\begin{aligned} P_k:= P^{[\lambda ]}_{2^{-k}} \end{aligned}$$
(4.36)

with the kernel \( P_k(x,y):= P^{[\lambda ]}_{2^{-k}}(x,y). \) Then, it is easy to see that for every \(f\in L^2({\mathbb {R}}_+,{dm_\lambda })\),

$$\begin{aligned} \lim _{k\rightarrow +\infty } P_k(f) = \lim _{k\rightarrow +\infty }P^{[\lambda ]}_{2^{-k}}(f)= f \quad {\mathrm{and }}\quad \lim _{k\rightarrow -\infty } P_k(f) = \lim _{k\rightarrow -\infty }P^{[\lambda ]}_{2^{-k}}(f)= 0 \end{aligned}$$

in the sense of \(L^2({\mathbb {R}}_+,{dm_\lambda })\). Moreover, by the size and smoothness conditions of the Poisson kernel \(P^{[\lambda ]}_t(x,y)\), it follows directly that \(P_k(x,y)\) satisfies the size and smoothness conditions \(({\mathrm{A_i}})\), \(({\mathrm{A_{ii}}})\) and \(({\mathrm{A_{iii}}})\) in Definition 2.1 for \(x, y\), with a certain positive constant \({{\overline{C}}}_\lambda \).

Also, from \({\mathrm{(S_{iii}})}\) in Lemma 2.8, we have that for any \(k\in {{\mathbb {Z}}}\) and \(x\in {\mathbb {R}}_+\),

$$\begin{aligned} \int _{{\mathbb {R}}_+}P_k(x,y) {dm_\lambda }(y)=\int _{{\mathbb {R}}_+}P_k(y,x) {dm_\lambda }(y)=1. \end{aligned}$$

Hence, \(\{P_k\}_{k\in {\mathbb {Z}}}\) is an approximation to the identity as in Definition 2.1. Then we set \(Q_k:=P_k-P_{k-1}\) as the difference operator, and it is obvious that the kernel \(Q_k(x,y)\) of \(Q_k\) satisfies the same size and smoothness conditions as \(P_k(x,y)\) does, and

$$\begin{aligned} \int _{{\mathbb {R}}_+}Q_k(x,y) {dm_\lambda }(y)=\int _{{\mathbb {R}}_+}Q_k(y,x) {dm_\lambda }(y)=0. \end{aligned}$$

Now, to distinguish the action on the different variables, for \(i=1,2\), we let \(\left\{ P_{k_i}^{(i)}\right\} _{k_i\in {\mathbb {Z}}}\) be the approximation to the identity on the \(i\)th variable as defined above, and similarly let \(Q_{k_i}^{(i)}\) be the corresponding difference operator.

Then, by [43, Theorem 2.9], we have the following Calderón reproducing formula:

$$\begin{aligned} f(x_1,x_2)&= \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} m_\lambda (I_{\alpha _1}^{k_1})m_\lambda (I_{\alpha _2}^{k_2}) \nonumber \\&\qquad \times {\tilde{Q}}^{(1)}_{k_1}(x_1,x_{I_{\alpha _1}^{k_1}}){\tilde{Q}}^{(2)}_{k_2} (x_2,x_{I_{\alpha _2}^{k_2}})Q^{(1)}_{k_1}Q^{(2)}_{k_2}(f) (x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}}), \end{aligned}$$
(4.37)

where the series converges in the sense of \( \big ({\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_{1,\,1}(\beta _1,\beta _2;\gamma _1,\gamma _2)\big )'\), and for \(i:=1,2\), \({\tilde{Q}}^{(i)}_{k_i}\) satisfies the same size, smoothness and cancellation conditions as \(Q^{(i)}_{k_i}\) does, \({\mathscr {X}}^{k}\) is as in Sect. 2, and \(x_{I_{\alpha _1}^{k_1}}\), \(x_{I_{\alpha _2}^{k_2}}\) are arbitrary points in the dyadic intervals \(I_{\alpha _1}^{k_1}\) and \(I_{\alpha _2}^{k_2}\), respectively.

We now prove (4.35). To begin with, for any \(f\in H^p_{{\mathcal {R}}_P}({\mathbb {R}}_\lambda ) \cap L^2({\mathbb {R}}_\lambda )\), \(\psi _1\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _1,\gamma _1)\) with \(\Vert \psi _1\Vert _{{\mathcal {G}}(x_1,\,2^{-\widetilde{k}_1},\,\beta _1,\,\gamma _1)}\le 1\) and \(\psi _2\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _2,\gamma _2)\) with \(\Vert \psi _2\Vert _{{\mathcal {G}}(x_2,\,2^{-\widetilde{k}_2},\,\beta _2,\,\gamma _2)}\le 1\), from (4.37) we obtain that

$$\begin{aligned} \langle f, \psi _1\psi _2\rangle&= \sum _{k_1}\sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} \sum _{k_2}\sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} m_\lambda (I_{\alpha _1}^{k_1})m_\lambda (I_{\alpha _2}^{k_2}) \\&\quad \times {\tilde{Q}}^{(1)}_{k_1}(\psi _1)(x_{I_{\alpha _1}^{k_1}}) {\tilde{Q}}^{(2)}_{k_2}(\psi _2)(x_{I_{\alpha _2}^{k_2}})Q^{(1)}_{k_1} Q^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}}). \end{aligned}$$

Next, recall again the following almost orthogonality estimate (see [43, Lemma 2.11]): for \(\epsilon \in (0, 1)\),

$$\begin{aligned}&\bigg |{\tilde{Q}}^{(1)}_{k_1}(\psi _1)(x_{I_{\alpha _1}^{k_1}}) {\tilde{Q}}^{(2)}_{k_2}(\psi _2)(x_{I_{\alpha _2}^{k_2}})\bigg |\,\lesssim \prod _{i=1}^2 2^{-|k_i-\widetilde{k}_i|\epsilon } \bigg (\frac{2^{-k_i}+2^{-\widetilde{k}_i}}{|x_i-x_{I_{\alpha _i}^{k_i}}| +2^{-k_i}+2^{-\widetilde{k}_i}}\bigg )^\epsilon \nonumber \\&\qquad \times \frac{1}{m_\lambda (I(x_i, 2^{-k_i}+2^{-\widetilde{k}_i} )) +m_\lambda (I(x_{I_{\alpha _i}^{k_i}}, 2^{-k_i}+2^{-\widetilde{k}_i})) +m_\lambda (I(x_i, |x_i-x_{I_{\alpha _i}^{k_i}}|))}. \end{aligned}$$
(4.38)

For arbitrary dyadic intervals \(I_{\alpha _1}^{k_1}\) and \(I_{\alpha _2}^{k_2}\), we choose \(x_{I_{\alpha _1}^{k_1}}\in I_{\alpha _1}^{k_1}\) and \(x_{I_{\alpha _2}^{k_2}}\in I_{\alpha _2}^{k_2}\) such that

$$\begin{aligned} \left|Q^{(1)}_{k_1}Q^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}})\right| \le 2\inf _{z_1\in I_{\alpha _1}^{k_1},\, z_2\in I_{\alpha _2}^{k_2}}\left|Q^{(1)}_{k_1}Q^{(2)}_{k_2}(f)(z_{1},z_{2})\right|, \end{aligned}$$

which, together with \(Q^{(i)}_{k_i}=P^{(i)}_{k_i}-P^{(i)}_{k_i-1}\), implies that

$$\begin{aligned} \left|Q^{(1)}_{k_1}Q^{(2)}_{k_2}(f)(x_{I_{\alpha _1}^{k_1}},x_{I_{\alpha _2}^{k_2}})\right|&\le 2\inf _{z_1\in I_{\alpha _1}^{k_1},\, z_2\in I_{\alpha _2}^{k_2}}\Big (\big |P^{(1)}_{k_1}P^{(2)}_{k_2}(f)(z_{1},z_{2}) \big |+\big |P^{(1)}_{k_1-1}P^{(2)}_{k_2}(f)(z_{1},z_{2})\big |\\&\quad +\big |P^{(1)}_{k_1}P^{(2)}_{k_2-1}(f)(z_{1},z_{2})\big | +\big |P^{(1)}_{k_1-1}P^{(2)}_{k_2-1}(f)(z_{1},z_{2})\big |\Big )\\&\le 8\inf _{z_1\in I_{\alpha _1}^{k_1},\, z_2\in I_{\alpha _2 }^{k_2}}{\mathcal {R}}_P(f)(z_1,z_2). \end{aligned}$$

Then, we further have the following estimate:

$$\begin{aligned} |\langle f, \psi _1\psi _2\rangle |&\lesssim \sum _{k_1} \sum _{k_2} 2^{-|k_1-\widetilde{k}_1|\epsilon }2^{-| k_2-\widetilde{k}_2|\epsilon } 2^{[(k_1\wedge \widetilde{k}_1)-k_1](2\lambda +1) (1-{1\over r})}2^{[(k_2\wedge \widetilde{k}_2)-k_2](2\lambda +1)(1-{1\over r})}\\&\quad \times \Bigg [ {\mathcal {M}}_1\bigg ( \sum _{\alpha _1 \in {\mathscr {X}}^{k_1+N_1}} {\mathcal {M}}_2\Big ( \sum _{\alpha _2 \in {\mathscr {X}}^{k_2+N_2}} \inf _{\genfrac{}{}{0.0pt}{}{z_1\in I_{\alpha _1}^{k_1}}{z_2\in I_{\alpha _2}^{k_2}}}\left|{\mathcal {R}}_P(f)(z_1,z_2)\right|^r \chi _{I_{\alpha _2}^{k_2}}(\cdot )\Big )(x_2) \chi _{I_{\alpha _1}^{k_1}}(\cdot ) \bigg )(x_1) \Bigg ]^{1\over r}\\&\lesssim \Bigg [ {\mathcal {M}}_1\bigg ( {\mathcal {M}}_2\Big ( \left|{\mathcal {R}}_P(f)\right|^r \Big ) \bigg )(x_1,x_2) \Bigg ]^{1\over r}, \end{aligned}$$

where \({2\lambda +1\over 2\lambda +2}<r<p\) and \(a\wedge b:= \min \{a,b\}\), which shows that (4.35) holds.
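To see where the lower bound on \(r\) enters, here is a sketch of the summation over \(k_i\) in the first line of the last display, under the assumptions \(r<1\) (which holds since \(r<p\le 1\)) and \(\epsilon \in (0,1)\) as in (4.38). The shorthand \(\theta \) is introduced only for this remark. Put \(\theta :=(2\lambda +1)({1\over r}-1)>0\). For \(k_i\le \widetilde{k}_i\) the factor \(2^{[(k_i\wedge \widetilde{k}_i)-k_i](2\lambda +1)(1-{1\over r})}\) equals 1, while for \(k_i>\widetilde{k}_i\) it equals \(2^{(k_i-\widetilde{k}_i)\theta }\). Hence

$$\begin{aligned} \sum _{k_i\in {\mathbb {Z}}}2^{-|k_i-\widetilde{k}_i|\epsilon }\,2^{[(k_i\wedge \widetilde{k}_i)-k_i](2\lambda +1)(1-{1\over r})} =\sum _{j=0}^\infty 2^{-j\epsilon }+\sum _{j=1}^\infty 2^{-j(\epsilon -\theta )}, \end{aligned}$$

which is finite precisely when \(\epsilon >\theta \). Since \(\theta <1\) is equivalent to \(r>{2\lambda +1\over 2\lambda +2}\), a choice of \(\epsilon \in (\theta ,1)\) is possible exactly in the stated range of \(r\).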

We now prove (4.34). For every \(\varphi :=\varphi _1\varphi _2\) with \(\Vert \varphi _i\Vert _{{\mathcal {G}}(x_i,\,t_i,\,\beta _i,\,\gamma _i)}\le 1,\ i=1,2\), let

$$\begin{aligned} \sigma _1:=\int _{{\mathbb {R}}_+}\varphi _1(x_1){dm_\lambda }(x_1),\, \sigma _2:=\int _{{\mathbb {R}}_+}\varphi _2(x_2){dm_\lambda }(x_2). \end{aligned}$$

It is obvious that \(|\sigma _1|, |\sigma _2|\lesssim 1\) since \(\varphi _i\in {\mathcal {G}}(x_i,\,t_i,\,\beta _i,\,\gamma _i)\) for \(i=1,2\). We set

$$\begin{aligned} \psi _{1,x_1}(y_1)&:={1\over 1+\sigma _1\widetilde{C}_\lambda }\left[ \varphi _1(y_1)- \sigma _1 P^{(1)}_{\widetilde{k}_1}(x_1,y_1) \right], \\ \psi _{2,x_2}(y_2)&:={1\over 1+\sigma _2\widetilde{C}_\lambda }\left[ \varphi _2(y_2)- \sigma _2 P^{(2)}_{\widetilde{k}_2}(x_2,y_2) \right]. \end{aligned}$$

Then, we see that \(\psi _{1,x_1}\in {\mathcal {G}}(x_1,\, 2^{-\widetilde{k}_1},\,\beta _1,\,\gamma _1)\) and \(\psi _{2,x_2}\in {\mathcal {G}}(x_2,\, 2^{-\widetilde{k}_2},\,\beta _2,\,\gamma _2)\). Based on the normalisation factor \({1\over 1+\sigma _1\widetilde{C}_\lambda }\), we obtain that

$$\begin{aligned} \Vert \psi _{1,x_1}\Vert _{{\mathcal {G}}(x_1,\, 2^{-\widetilde{k}_1},\,\beta _1,\,\gamma _1)}\le 1 \quad {\mathrm{and}}\quad \Vert \psi _{2,x_2}\Vert _{{\mathcal {G}}(x_2,\, 2^{-\widetilde{k}_2},\,\beta _2,\,\gamma _2)}\le 1. \end{aligned}$$

Moreover, we point out that

$$\begin{aligned} \int _{{\mathbb {R}}_+}\psi _{1,x_1}(y_1){dm_\lambda }(y_1)=0 \quad {\mathrm{for\ all\ }}\ x_1\in {\mathbb {R}}_+ \end{aligned}$$

since \(\int _{{\mathbb {R}}_+} P^{(1)}_{\widetilde{k}_1}(x_1,y_1) {dm_\lambda }(y_1)=1 \quad {\mathrm{for\ all\ }} x_1\in {\mathbb {R}}_+\). Similarly we have the cancellation property for \(\psi _{2,x_2}(y_2)\). Hence, we further obtain that \(\psi _{1,x_1}\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _1,\gamma _1)\) and \(\psi _{2,x_2}\in {\mathop {{{\mathcal {G}}}}\limits ^{\circ }}_1(\beta _2,\gamma _2)\).
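Explicitly, reading the function in the bracket defining \(\psi _{1,x_1}\) as the first factor \(\varphi _1\) of \(\varphi \), the cancellation is the following one-line computation, which uses only the definition of \(\sigma _1\) and the normalisation of the kernel \(P^{(1)}_{\widetilde{k}_1}\):

$$\begin{aligned} \int _{{\mathbb {R}}_+}\psi _{1,x_1}(y_1){dm_\lambda }(y_1)&={1\over 1+\sigma _1\widetilde{C}_\lambda }\left[ \int _{{\mathbb {R}}_+}\varphi _1(y_1){dm_\lambda }(y_1)-\sigma _1\int _{{\mathbb {R}}_+}P^{(1)}_{\widetilde{k}_1}(x_1,y_1){dm_\lambda }(y_1)\right] \\&={\sigma _1-\sigma _1\over 1+\sigma _1\widetilde{C}_\lambda }=0. \end{aligned}$$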

Based on the definition of \(\psi _{1,x_1}\) and \(\psi _{2,x_2}\), we have

$$\begin{aligned} |\langle f, \varphi \rangle |&=\bigg |\bigg \langle f, \left[\sigma _1 P_{\widetilde{k}_1}(x_1,\cdot )+(1+\sigma _1\,\widetilde{C}_\lambda )\psi _{1,x_1}(\cdot )\right]\left[\sigma _2P_{\widetilde{k}_2}(x_2,\cdot )+(1+\sigma _2\,\widetilde{C}_\lambda )\psi _{2,x_2}(\cdot )\right] \bigg \rangle \bigg |\\&\le \bigg |\bigg \langle f, \sigma _1\sigma _2P_{\widetilde{k}_1}(x_1,\cdot )P_{\widetilde{k}_2}(x_2,\cdot ) \bigg \rangle \bigg |+\bigg |\bigg \langle f, (1+\sigma _1\,\widetilde{C}_\lambda )\psi _{1,x_1}(\cdot )\sigma _2P_{\widetilde{k}_2}(x_2,\cdot ) \bigg \rangle \bigg |\\&\quad +\bigg |\bigg \langle f, \sigma _1P_{\widetilde{k}_1}(x_1,\cdot )(1+\sigma _2\,\widetilde{C}_\lambda )\psi _{2,x_2}(\cdot ) \bigg \rangle \bigg |\\&\quad +\bigg |\bigg \langle f, (1+\sigma _1\,\widetilde{C}_\lambda )\psi _{1,x_1}(\cdot )(1+\sigma _2\,\widetilde{C}_\lambda )\psi _{2,x_2}(\cdot ) \bigg \rangle \bigg |\\&=:A_1+A_2+A_3+A_4. \end{aligned}$$

For the term \(A_1\), from the definition of \({\mathcal {R}}_P(f)\) in Sect. 1, we get that

$$\begin{aligned} A_1\lesssim {\mathcal {R}}_P(f)(x_1,x_2) =\big \{ |{\mathcal {R}}_P(f)(x_1,x_2)|^r\big \}^{1\over r}\le \Bigg [ {\mathcal {M}}_1\bigg ( {\mathcal {M}}_2\Big ( \left|{\mathcal {R}}_P(f)\right|^r \Big ) \bigg )(x_1,x_2) \Bigg ]^{1\over r} \end{aligned}$$

for any \(r\in (0,1]\). For the term \(A_4\), from (4.35) we obtain that

$$\begin{aligned} A_4\lesssim \Bigg [ {\mathcal {M}}_1\bigg ( {\mathcal {M}}_2\Big ( \left|{\mathcal {R}}_P(f)\right|^r \Big ) \bigg )(x_1,x_2) \Bigg ]^{1\over r} \end{aligned}$$

for \({2\lambda +1\over 2\lambda +2}<r<p\). As for \(A_2\), let \(F_{x_2}(\cdot ):=\langle f, P_{\widetilde{k}_2}(x_2,\cdot ) \rangle \). Then we have

$$\begin{aligned} A_2&\sim \bigg |\Big \langle F_{x_2}(\cdot ), (1+\sigma _1\,\widetilde{C}_\lambda )\psi _{1,x_1}(\cdot ) \Big \rangle \bigg |. \end{aligned}$$

Then, following the same approach as above, now in the first variable only, by using the reproducing formula in terms of \(Q^{(1)}_{k_1}\) and the almost orthogonality estimate, we obtain that

$$\begin{aligned} A_2&\lesssim \Bigg [ {\mathcal {M}}_1\bigg ( \Big [\sup _{t_1>0}\left|P_{t_1}^{[\lambda ]}(F_{x_2}(\cdot ))\right|\Big ]^r \bigg )(x_1) \Bigg ]^{1\over r}\\&= \Bigg [ {\mathcal {M}}_1\bigg ( \Big [\sup _{t_1>0}\left|P_{t_1}^{[\lambda ]}P_{2^{-\widetilde{k}_2}}^{[\lambda ]}(f)(\cdot ,x_2)\right|\Big ]^r \bigg )(x_1) \Bigg ]^{1\over r}, \end{aligned}$$

which is further bounded by

$$\begin{aligned} \Bigg [ {\mathcal {M}}_1\bigg ( {\mathcal {M}}_2\Big ( \left|{\mathcal {R}}_P(f)\right|^r \Big ) \bigg )(x_1,x_2) \Bigg ]^{1\over r}. \end{aligned}$$

Similarly, we obtain that \(A_3\) satisfies the same estimates. Combining the estimates of \(A_1\), \(A_2\), \(A_3\) and \(A_4\), we obtain that (4.34) holds.

Step 4: \(\Vert f\Vert _{H^p_{{{\mathcal {R}}}_{P}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\) for \(f\in {H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\cap L^2({\mathbb {R}}_\lambda )\).

Indeed, we recall the well-known subordination formula that for all \(f\in L^2({\mathbb {R}}_\lambda )\),

$$\begin{aligned} {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f(x_1, x_2)=\frac{1}{\pi }{\int _0^\infty \int _0^\infty }\frac{e^{-u_1}}{\sqrt{u_1}}\frac{e^{-u_2}}{\sqrt{u_2}}e^{-\frac{t_1^2}{4u_1} \Delta _\lambda }e^{-\frac{t_2^2}{4u_2}\Delta _\lambda }f(x_1, x_2)\,du_1du_2.\nonumber \\ \end{aligned}$$
(4.39)

From this, it follows that

$$\begin{aligned} {{\mathcal {R}}}_Pf(x_1, x_2)\lesssim & {} {\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\int _0^\infty \int _0^\infty }\frac{e^{-u_1}}{\sqrt{u_1}}\frac{e^{-u_2}}{\sqrt{u_2}} \left|e^{-\frac{t_1^2}{4u_1}\Delta _\lambda }e^{-\frac{t_2^2}{4u_2}\Delta _\lambda }f(x_1, x_2)\right|\,du_1du_2\\\lesssim & {} {{\mathcal {R}}}_hf(x_1,x_2){\int _0^\infty \int _0^\infty }\frac{e^{-u_1}}{\sqrt{u_1}}\frac{e^{-u_2}}{\sqrt{u_2}}\,du_1du_2\\\lesssim & {} {{\mathcal {R}}}_hf(x_1,x_2), \end{aligned}$$

which further implies that for all \(f\in {H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\cap {L^2({\mathbb {R}}_\lambda )}\), \(\Vert f\Vert _{H^p_{{{\mathcal {R}}}_{P}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}.\)
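The last step in the display above only uses that the weight in (4.39) has finite total mass. As a quick check, the substitution \(u=s^2\) gives

$$\begin{aligned} \int _0^\infty \frac{e^{-u}}{\sqrt{u}}\,du=2\int _0^\infty e^{-s^2}\,ds=\sqrt{\pi }=\Gamma \big (\tfrac{1}{2}\big ), \qquad {\mathrm{so\ that}}\qquad \frac{1}{\pi }{\int _0^\infty \int _0^\infty }\frac{e^{-u_1}}{\sqrt{u_1}}\frac{e^{-u_2}}{\sqrt{u_2}}\,du_1du_2=1. \end{aligned}$$

Thus the factor \(\frac{1}{\pi }\) in (4.39) normalises the weight to a probability measure. Assuming that \({{\mathcal {R}}}_hf\) is the supremum of \(|e^{-s_1\Delta _\lambda }e^{-s_2\Delta _\lambda }f|\) over all \(s_1,s_2>0\), this suggests that the implicit constant above can in fact be taken to be 1.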

Step 5: \(\Vert f\Vert _{H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\le \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}\) for \(f\in {H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}\cap L^2({\mathbb {R}}_\lambda )\).

Observe that for all \(f\in {L^2({\mathbb {R}}_\lambda )}\), \({{\mathcal {R}}}_h f\le {{\mathcal {N}}}_h f\). Then we see that for all \(f\in {H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}\cap {L^2({\mathbb {R}}_\lambda )}\),

$$\begin{aligned} \Vert f\Vert _{H^p_{{{\mathcal {R}}}_{h}}({\mathbb {R}}_\lambda )}\le \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}. \end{aligned}$$

Step 6: \( \Vert f\Vert _{H^p_{{{\mathcal {N}}}_{h}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\) for \(f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\cap L^2({\mathbb {R}}_\lambda )\).

We claim that it suffices to prove that there exists a constant \(\delta >0\) such that for every rectangular atom \(\alpha _R\) as in Definition 2.6, and \({\gamma }_1,{\gamma }_2\ge 2\),

$$\begin{aligned} \int _{{\mathbb {R}}_+\setminus {\gamma }_1 I}{\int _0^\infty }|{{\mathcal {N}}}_h (\alpha _R)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\lesssim [\mu _\lambda (R)]^{1-\frac{p}{2}} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p {\gamma }_1^{-\delta }\nonumber \\ \end{aligned}$$
(4.40)

and

$$\begin{aligned} {\int _0^\infty }\int _{{\mathbb {R}}_+\setminus {\gamma }_2 J} |{{\mathcal {N}}}_h (\alpha _R)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\lesssim [\mu _\lambda (R)]^{1-\frac{p}{2}} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p {\gamma }_2^{-\delta }.\qquad \quad \end{aligned}$$
(4.41)

In fact, if (4.40) and (4.41) hold, then we can obtain that for every atom a of \(H^p({\mathbb {R}}_\lambda )\), we have

$$\begin{aligned} {\int _0^\infty }{\int _0^\infty }|{{\mathcal {N}}}_h(a)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\lesssim 1. \end{aligned}$$
(4.42)

To see this, suppose a is supported in an open set \(\Omega \subset {\mathbb {R}}_+\times {\mathbb {R}}_+\) with finite measure and \(a:=\sum _{R\in m(\Omega )}\alpha _R. \) We now define

$$\begin{aligned} {\widetilde{\Omega }}&:=\{ (x_1,x_2) \in {\mathbb {R}}_+\times {\mathbb {R}}_+: {{\mathcal {M}}_S}(\chi _\Omega )(x_1,x_2)>1/2 \},\\ \widetilde{{\widetilde{\Omega }}}&:=\{ (x_1,x_2) \in {\mathbb {R}}_+\times {\mathbb {R}}_+: {{\mathcal {M}}_S}(\chi _{{\widetilde{\Omega }}})(x_1,x_2)>1/2 \}, \end{aligned}$$

where the strong maximal function \({{\mathcal {M}}_S}\) is defined as in (3.14). Moreover, for every \(R:=I\times J \in m_1(\Omega )\) (see Sect. 2.1 for the definition of \(m_1(\Omega )\)), let \({\tilde{I}}\) be the largest dyadic interval containing I such that \({\tilde{R}}:={\tilde{I}}\times J\subset {\widetilde{\Omega }}\) and let \({\tilde{J}}\) be the largest dyadic interval containing J such that \(\tilde{{\tilde{R}}}:={\tilde{I}}\times {\tilde{J}}\subset \widetilde{{\widetilde{\Omega }}}\). We now let \(\gamma _1:={|\widetilde{I}|\over |I|}\) and \(\gamma _2:={|\widetilde{J}|\over |J|}\), where for an interval I, |I| is the Euclidean length of I. Then we have

$$\begin{aligned}&{\int _0^\infty }{\int _0^\infty }|{{\mathcal {N}}}_h(a)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&\quad =\Bigg (\iint _{\cup _{R\in m(\Omega )} 10 \tilde{{\tilde{R}}} } + \iint _{ \big (\cup _{R\in m(\Omega )} 10 \tilde{{\tilde{R}}}\big )^c }\Bigg ) |{{\mathcal {N}}}_h(a)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&\quad =:{\mathrm{A_1}}+{\mathrm{A}}_2. \end{aligned}$$

For the term \({\mathrm{A}}_1\), using Hölder’s inequality and the \({L^2({\mathbb {R}}_\lambda )}\)-boundedness of \({{\mathcal {N}}}_h\), we have that

$$\begin{aligned} {\mathrm{A}}_1&\le \Big [\mu _\lambda \big ( \bigcup _{R\in m(\Omega )} 10 \tilde{{\tilde{R}}} \big )\Big ]^{1-{p\over 2}} \Bigg (\iint _{\cup _{R\in m(\Omega )} 10 \tilde{{\tilde{R}}} } |{{\mathcal {N}}}_h(a)(x_1, x_2)|^2\,d\mu _\lambda (x_1,x_2)\Bigg )^{p\over 2}\\&\lesssim \Big [\mu _\lambda \big ( \Omega \big )\Big ]^{1-{p\over 2}} \Vert a\Vert _{L^2({\mathbb {R}}_\lambda )}^p\lesssim \Big [\mu _\lambda \big ( \Omega \big )\Big ]^{1-{p\over 2}} \Big [ \mu _\lambda \big ( \Omega \big )^{{1\over 2}-{1\over p}}\Big ]^p \lesssim 1. \end{aligned}$$
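The bookkeeping of exponents in this estimate is as follows. Hölder's inequality is applied with the conjugate exponents \({2\over p}\) and \({2\over 2-p}\), which produces the power \(1-{p\over 2}\) of the measure, and the size condition \(\Vert a\Vert _{L^2({\mathbb {R}}_\lambda )}\le \mu _\lambda (\Omega )^{{1\over 2}-{1\over p}}\) of the atom is used in the last step. Indeed,

$$\begin{aligned} \Big [\mu _\lambda \big ( \Omega \big )\Big ]^{1-{p\over 2}}\Big [ \mu _\lambda \big ( \Omega \big )^{{1\over 2}-{1\over p}}\Big ]^p =\mu _\lambda (\Omega )^{1-{p\over 2}}\,\mu _\lambda (\Omega )^{{p\over 2}-1}=1. \end{aligned}$$

The passage from \(\mu _\lambda \big (\bigcup _{R\in m(\Omega )}10\tilde{{\tilde{R}}}\big )\) to \(\mu _\lambda (\Omega )\) is, as far as we can tell, a consequence of the \(L^2({\mathbb {R}}_\lambda )\)-boundedness of \({{\mathcal {M}}_S}\) applied twice, as in (4.32).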

For the term \({\mathrm{A}}_2\), we have

$$\begin{aligned} {\mathrm{A}}_2&\le \sum _{R\in m(\Omega )} \iint _{ \big (10 \tilde{{\tilde{R}}}\big )^c } |{{\mathcal {N}}}_h(\alpha _R)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&\le \sum _{R\in m(\Omega )} \iint _{ ({\mathbb {R}}_+\setminus 10{\tilde{I}})\times {\mathbb {R}}_+ } |{{\mathcal {N}}}_h(\alpha _R)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&\quad + \sum _{R\in m(\Omega )} \iint _{ {\mathbb {R}}_+\times ({\mathbb {R}}_+\setminus 10{\tilde{J}}) } |{{\mathcal {N}}}_h(\alpha _R)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&=: {\mathrm{A}}_{21}+{\mathrm{A}}_{22}. \end{aligned}$$

Since \(\gamma _1 I\subset 10{{\tilde{I}}}\) and \(\gamma _2 J\subset 10{{\tilde{J}}}\), from (4.40) and (4.41), we have

$$\begin{aligned} {\mathrm{A}}_{21}&\lesssim \sum _{R\in m(\Omega )} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p\ \mu _\lambda (R)^{1-{p\over 2}} \gamma _1^{-\delta } \end{aligned}$$

and that

$$\begin{aligned} {\mathrm{A}}_{22}&\lesssim \sum _{R\in m(\Omega )}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p\ \mu _\lambda (R)^{1-{p\over 2}} \gamma _2^{-\delta }. \end{aligned}$$

As a consequence, using Hölder’s inequality and Journé’s covering lemma ([55, 72], see also the version on spaces of homogeneous type in [41]) we get that

$$\begin{aligned} {\mathrm{A}}_2&\lesssim \bigg (\sum _{R\in m(\Omega )} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^2\bigg )^{p\over 2} \Bigg (\sum _{R\in m(\Omega )} \mu _\lambda (R) \gamma _1^{-2\delta } \Bigg )^{1-{p\over 2}}\\&\quad + \bigg (\sum _{R\in m(\Omega )} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^2\bigg )^{p\over 2} \Bigg (\sum _{R\in m(\Omega )} \mu _\lambda (R) \gamma _2^{-2\delta } \Bigg )^{1-{p\over 2}}\\&\lesssim \mu _\lambda (\Omega )^{{p\over 2}-1}\ \mu _\lambda (\Omega )^{1-{p\over 2}}\lesssim 1. \end{aligned}$$
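A sketch of the Hölder step for sums used here, with the conjugate exponents \({2\over p}\) and \({2\over 2-p}\), is

$$\begin{aligned} \sum _{R\in m(\Omega )}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p\,\mu _\lambda (R)^{1-{p\over 2}}\gamma _1^{-\delta } \le \bigg (\sum _{R\in m(\Omega )}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^2\bigg )^{p\over 2} \bigg (\sum _{R\in m(\Omega )}\mu _\lambda (R)\,\gamma _1^{-{2\delta \over 2-p}}\bigg )^{1-{p\over 2}}. \end{aligned}$$

Carried out this way, the power of \(\gamma _1\) in the second factor is \({2\delta \over 2-p}\), a positive multiple of \(\delta \). Its precise value should play no role, since Journé's covering lemma, in the form we understand it, bounds \(\sum _{R}\mu _\lambda (R)\gamma _1^{-\delta '}\) by a multiple of \(\mu _\lambda (\Omega )\) for every \(\delta '>0\). The first factor is controlled by the size condition \(\sum _{R}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^2\le \mu _\lambda (\Omega )^{1-{2\over p}}\) of the atom, whose \({p\over 2}\)th power is \(\mu _\lambda (\Omega )^{{p\over 2}-1}\).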

Combining the estimates of the two terms \({\mathrm{A}}_1\) and \({\mathrm{A}}_2\), we get that (4.42) holds.

Now, based on (4.42), for every \(f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\), we have that \( f=\sum _{j}\lambda _j a_j\) with \(\sum _j|\lambda _j|^p \sim \Vert f\Vert _{{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}}^p\). Hence,

$$\begin{aligned}&\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+} |{{\mathcal {N}}}_h(f)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\,\le \sum _j|\lambda _j|^p \iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+} |{{\mathcal {N}}}_h (a_j)(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\\&\quad \lesssim \sum _j|\lambda _j|^p \lesssim \Vert f\Vert _{{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}}^p. \end{aligned}$$
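The first inequality in this display rests on two elementary facts, sketched here under the assumption that the atomic series converges in a sense (for instance in \(L^2({\mathbb {R}}_\lambda )\)) that allows \({{\mathcal {N}}}_h\) to be applied term by term. First, \({{\mathcal {N}}}_h\) is sublinear, so that \({{\mathcal {N}}}_h(f)\le \sum _j|\lambda _j|\,{{\mathcal {N}}}_h(a_j)\) pointwise. Second, for \(0<p\le 1\) and nonnegative numbers \(c_j\),

$$\begin{aligned} \Big (\sum _j c_j\Big )^p\le \sum _j c_j^p, \qquad {\mathrm{hence}}\qquad |{{\mathcal {N}}}_h(f)(x_1,x_2)|^p\le \sum _j|\lambda _j|^p\,|{{\mathcal {N}}}_h(a_j)(x_1,x_2)|^p. \end{aligned}$$

Integrating the second inequality over \({\mathbb {R}}_+\times {\mathbb {R}}_+\) and applying (4.42) to each atom gives the claim.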

Thus, to prove Step 6, it suffices to prove (4.40) and (4.41). By symmetry, we only prove (4.40). To this end, abbreviating \(\gamma _1\) to \(\gamma \), we write

$$\begin{aligned}&\int _{{\mathbb {R}}_+\setminus {\gamma }I}{\int _0^\infty }|{{\mathcal {N}}}_h\alpha _R(x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2) \\&\quad = \left[\sum _{k=0}^\infty \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\int _{8J} +\sum _{k=0}^\infty \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\int _{{\mathbb {R}}_+\setminus 8J}\right] |{{\mathcal {N}}}_h\alpha _R(x_1, x_2)|^p\,{\,d\mu _\lambda (x_1,x_2)}\\&\quad =:{\mathrm{F}}_1+{\mathrm{F}}_2. \end{aligned}$$

For \({\mathrm{F}}_1\), since

$$\begin{aligned} {{\mathcal {N}}}_h\alpha _R(x_1, x_2)\le \sup _{|x_2-y_2|<t_2}W^{[\lambda ]}_{t_2^2} \Big (\sup _{|x_1-y_1|<t_1}|W^{[\lambda ]}_{t_1^2} \alpha _R(y_1,\cdot )|\Big )(y_2), \end{aligned}$$

by Hölder’s inequality and the \({L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\)-boundedness of \(\sup _{|x_2-y_2|<t_2}|W^{[\lambda ]}_{t_2^2} f(y_2)|\), we have that

$$\begin{aligned} {\mathrm{F}}_1&\le \sum _{k=0}^\infty [m_\lambda (8J)]^{1-\frac{p}{2}} \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\left[\int _{8J}\left[{{\mathcal {N}}}_h\alpha _R(x_1, x_2)\right]^2{\,dm_\lambda (x_2)}\right]^{\frac{p}{2}}{\,dm_\lambda (x_1)}\\&\lesssim [m_\lambda (J)]^{1-\frac{p}{2}} \sum _{k=0}^\infty \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\left[\int _J\left[\sup _{|x_1-y_1|<t_1}\left|W^{[\lambda ]}_{t_1^2}\alpha _R(\cdot , x_2)(y_1)\right|\right]^2{\,dm_\lambda (x_2)}\right]^{\frac{p}{2}}{\,dm_\lambda (x_1)}. \end{aligned}$$

Since for any fixed \(x_2\), \({\int _0^\infty }\alpha _R(z_1, x_2){dm_\lambda }(z_1)=0\), we conclude that

$$\begin{aligned}&|W^{[\lambda ]}_{t_1^2}\alpha _R(\cdot , x_2)(y_1)|\\&\quad =\left|\int _{I}\left[W^{[\lambda ]}_{t_1^2}(y_1, z_1)-W^{[\lambda ]}_{t_1^2}(y_1, x^1_0)\right]\alpha _R(z_1, x_2){dm_\lambda }(z_1)\right|\\&\quad \lesssim \int _{I}\frac{t_1 |z_1-x_0^1||\alpha _R(z_1, x_2)| }{[m_\lambda (I(x^1_0, |y_1-x^1_0|))+m_\lambda (I(y_1, |y_1-x^1_0|))+m_\lambda (I(y_1, t_1))](|x^1_0-y_1|+t_1)^2}{dm_\lambda }(z_1), \end{aligned}$$

where \(x^1_0\) is the center of I, and the last inequality follows from the fact that \({W^{[\lambda ]}_{t_1}}(y_1, z_1)\) as a function of \(z_1\) satisfies \(({\mathrm{K_{ii}}})\) in Sect. 3.1, and that for \(z_1\in I, x_1\notin \gamma I, y_1, t_1\) such that \(|x_1-y_1|<t_1\),

$$\begin{aligned} |z_1-x_0^1|<\frac{|I|}{2}\le \frac{|x_0^1-y_1|+t_1}{2}. \end{aligned}$$

Thus, by \(|x_1-x^1_0|\le |x^1_0-y_1|+t_1,\) we have

$$\begin{aligned} \sup _{|x_1-y_1|<t_1}|W^{[\lambda ]}_{t_1^2}\alpha _R(\cdot , x_2)(y_1)|\lesssim \int _{I}\frac{|I|}{m_\lambda (I(x^1_0, |x_1-x^1_0|))|x_1-x^1_0|}|\alpha _R(z_1, x_2)|{dm_\lambda }(z_1). \end{aligned}$$

As a consequence, we obtain that

$$\begin{aligned} {\mathrm{F}}_1&\lesssim [m_\lambda (J)]^{1-\frac{p}{2}} [m_\lambda (I)]^{\frac{p}{2}}\sum _{k=0}^\infty \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\frac{|I|^p}{|x_1-x^1_0|^p}\frac{1}{m_\lambda (I(x^1_0, |x_1-x^1_0|))^p}\\&\quad \times \left[\int _{J}\int _{I}\left[\alpha _R(z_1, x_2)\right]^2{dm_\lambda }(z_1){\,dm_\lambda (x_2)}\right]^{\frac{p}{2}}{\,dm_\lambda (x_1)}\\&\lesssim [m_\lambda (J)]^{1-\frac{p}{2}} [m_\lambda (I)]^{\frac{p}{2}}\gamma ^{-p}\sum _{k=0}^\infty \frac{m_\lambda (2^{k+1}\gamma I)^{1-p}}{2^{kp}} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p\\&\lesssim [m_\lambda (J)]^{1-\frac{p}{2}} [m_\lambda (I)]^{\frac{p}{2}}\gamma ^{-p+(1-p)(2\lambda +1)}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p m_\lambda (I)^{1-p} \sum _{k=0}^\infty \frac{2^{(k+1)(2\lambda +1)(1-p)}}{2^{kp}}\\&\lesssim [\mu _\lambda (R)]^{1-\frac{p}{2}} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p {\gamma }^{-p+(1-p)(2\lambda +1)}, \end{aligned}$$

where the last inequality follows from the condition that \(p\in ({2\lambda +1\over 2\lambda +2},1]\).
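To make the role of this condition explicit, write \(\delta :=p-(1-p)(2\lambda +1)\) for the exponent appearing in the power of \(\gamma \). The series in the third line of the estimate of \({\mathrm{F}}_1\) is geometric:

$$\begin{aligned} \sum _{k=0}^\infty \frac{2^{(k+1)(2\lambda +1)(1-p)}}{2^{kp}} =2^{(2\lambda +1)(1-p)}\sum _{k=0}^\infty 2^{-k\delta } =\frac{2^{(2\lambda +1)(1-p)}}{1-2^{-\delta }}, \end{aligned}$$

which is finite if and only if \(\delta >0\). Moreover,

$$\begin{aligned} \delta>0 \iff p\,(2\lambda +2)>2\lambda +1 \iff p>{2\lambda +1\over 2\lambda +2}, \end{aligned}$$

so the series converges exactly in the stated range of \(p\).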

For \({\mathrm{F}}_2\), let \(x^2_0\) be the center of J. By the cancellation of \(\alpha _R\) and the property \(({\mathrm{K_{ii}}})\) for \({W^{[\lambda ]}_{t_1}}(y_1, z_1)\) and \({W^{[\lambda ]}_{t_2}}(y_2, z_2)\), we also have

$$\begin{aligned}&{{\mathcal {N}}}_{h}\alpha _R(x_1, x_2)\\&\quad \le \!\sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|<t_1}{|y_2-x_2|<t_2}}\! \int _{I}\!\int _{J}\left|W^{[\lambda ]}_{t_1^2}(y_1, z_1)-W^{[\lambda ]}_{t_1^2}(y_1, x^1_0)\right|\left|W^{[\lambda ]}_{t_2^2}(y_2, z_2)-W^{[\lambda ]}_{t_2^2}(y_2, x^2_0)\right|\\&\qquad |\alpha _R(z_1, z_2)|d\mu _\lambda (z_1,z_2)\\&\quad \lesssim \sup _{\genfrac{}{}{0.0pt}{}{|y_1-x_1|<t_1}{|y_2-x_2|<t_2}}\int _{I}\int _{J}\frac{1}{m_\lambda (I(x^1_0, |x^1_0-x_1|))}\frac{t_1|I|}{(|x^1_0-y_1|+t_1)^2}\\&\qquad \times \frac{1}{m_\lambda (I(x^2_0, |x^2_0-x_2|))}\frac{t_2|J|}{(|x^2_0-y_2|+t_2)^2}|\alpha _R(z_1, z_2)|d\mu _\lambda (z_1,z_2)\\&\lesssim \frac{|I|}{|x_1-x^1_0|}\frac{1}{m_\lambda (I(x^1_0, |x_1-x^1_0|))} \frac{|J|}{|x_2-x^2_0|}\frac{1}{m_\lambda (I(x^2_0, |x_2-x^2_0|))} [\mu _\lambda (R)]^{\frac{1}{2}}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}. \end{aligned}$$

Therefore,

$$\begin{aligned} {\mathrm{F}}_2&\lesssim [\mu _\lambda (R)]^{\frac{p}{2}}\Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p\\&\quad \sum _{k=0}^\infty \sum _{l=3}^\infty \int _{2^{k+1}{\gamma }I\setminus 2^k{\gamma }I}\int _{2^{l+1}J\setminus 2^lJ}\frac{|I|^p}{|x_1-x^1_0|^p}\frac{1}{m_\lambda (I(x^1_0, |x_1-x^1_0|))^p}\\&\quad \times \frac{|J|^p}{|x_2-x^2_0|^p}\frac{1}{m_\lambda (I(x^2_0, |x_2-x^2_0|))^p}{\,dm_\lambda (x_2)}{\,dm_\lambda (x_1)}\\&\lesssim [\mu _\lambda (R)]^{1-\frac{p}{2}} \Vert \alpha _R\Vert _{L^2({\mathbb {R}}_\lambda )}^p {\gamma }^{-p+(1-p)(2\lambda +1)}. \end{aligned}$$

Combining the estimates of \({\mathrm{F}}_1\) and \({\mathrm{F}}_2\), we see that we may take \(\delta :=p-(1-p)(2\lambda +1)\), which is positive since \(p\in ({2\lambda +1\over 2\lambda +2},1]\). As a consequence, we obtain (4.40). This finishes the proof of Theorem 1.4.

5 Proofs of Theorems 1.6 and 1.8

In this section, we present the proofs of Theorems 1.6 and 1.8. We first note that \({R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}\) is a product Calderón–Zygmund operator on the space of homogeneous type \({\mathbb {R}}_\lambda \) (see the definition in Section 1 of [41]). We also regard \({R_{\Delta _\lambda ,\,1}}\) as \({R_{\Delta _\lambda ,\,1}}\otimes Id_2\) and \({R_{\Delta _\lambda ,\,2}}\) as \(Id_1\otimes {R_{\Delta _\lambda ,\,2}}\), where \(Id_1\) and \(Id_2\) denote the identity operator on \(L^2({\mathbb {R}}_+,dm_\lambda )\). In this way, \({R_{\Delta _\lambda ,\,1}}\) and \({R_{\Delta _\lambda ,\,2}}\) can also be understood as product Calderón–Zygmund operators on \({\mathbb {R}}_\lambda \). We recall that product Calderón–Zygmund operators \(T\) are bounded on \({L^r({\mathbb {R}}_\lambda )}\) for \(r\in (1, \infty )\), on \(H^p({\mathbb {R}}_\lambda )\) ([41, Section 3.1]) and from \(H^p({\mathbb {R}}_\lambda )\) to \(L^p({\mathbb {R}}_\lambda )\) for \(p\in ((2\lambda +1)/(2\lambda +2),1]\) ([41, Section 2.1.3]). Hence, for any \(f\in H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )\), we have

$$\begin{aligned}&\left\Vert{R_{\Delta _\lambda ,\,1}}\left(f\right)\right\Vert_{L^p({\mathbb {R}}_\lambda )}+\left\Vert{R_{\Delta _\lambda ,\,2}}\left(f\right)\right\Vert_{L^p({\mathbb {R}}_\lambda )}+\left\Vert{R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}\left(f\right)\right\Vert_{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}.\nonumber \\ \end{aligned}$$
(5.1)

Before we give the proof of Theorem 1.6, we first recall the following result [68, Lemma 11].

Lemma 5.1

Suppose that

  1. (i)

    \(u(t, x)\) is continuous in \(t\in [0, \infty )\), \(x\in {{\mathbb {R}}}\) and even in x;

  2. (ii)

    In the region where \(u(t, x)>0\), u is of class \(C^2\) and satisfies \(\partial _t^2u+\partial _x^2u +2\lambda x^{-1}\partial _x u\ge 0\);

  3. (iii)

    \(u(0, x)=0;\)

  4. (iv)

    For some \(r\in [1, \infty )\), there exists a positive constant \(\widetilde{C}\) such that

    $$\begin{aligned} \sup _{0<t<\infty }\displaystyle \int _0^\infty |u(t, x)|^r\,dm_\lambda (x)\le \widetilde{C}<\infty . \end{aligned}$$

Then \(u(t, x)\le 0\).

For \(f\in {L^p({\mathbb {R}}_\lambda )}\) with \(p\in [1, \infty )\), and \(t_1,\,t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\), let

$$\begin{aligned} u(t_1,\,t_2,\,x_1,\,x_2)&:={P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f(x_1,x_2),\nonumber \\ v(t_1,\,t_2,\,x_1,\,x_2)&:={Q^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f(x_1,x_2), \end{aligned}$$
(5.2)

and

$$\begin{aligned} w(t_1,\,t_2,\,x_1,\,x_2)&:={P^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}f(x_1,x_2),\nonumber \\z(t_1,\,t_2,\,x_1,\,x_2)&:={Q^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}f(x_1,x_2), \end{aligned}$$
(5.3)

where \({Q^{[\lambda ]}_{t_1}}\) and \({Q^{[\lambda ]}_{t_2}}\) are defined as in (2.3). Moreover, define

$$\begin{aligned} u^*(x_1, x_2):={{\mathcal {R}}}_{P}f(x_1, x_2) \end{aligned}$$
(5.4)

and

$$\begin{aligned} F(t_1,\,t_2,\,x_1,\,x_2)&:=\left\lbrace [u(t_1,\,t_2,\,x_1,\,x_2)]^2 +[v(t_1,\,t_2,\,x_1,\,x_2)]^2\right. \nonumber \\&\quad +\left.[w(t_1,\,t_2,\,x_1,\,x_2)]^2+[z(t_1,\,t_2,\,x_1,\,x_2)]^2\right\rbrace ^{\frac{1}{2}}. \end{aligned}$$
(5.5)

We first establish the following lemma.

Lemma 5.2

Let \(f\in {H^1_{Riesz}({\mathbb {R}}_\lambda )}\), u, v, w, z and F be, respectively, as in (5.2), (5.3) and (5.5). Then there exists a positive constant C independent of f, u, v, w, z and F, such that

$$\begin{aligned}&{\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}F(t_1,\,t_2,\,x_1,\,x_2){\,d\mu _\lambda (x_1,x_2)}\le C \Vert f\Vert _{H^1_{Riesz}({\mathbb {R}}_\lambda )}. \end{aligned}$$

Proof

It suffices to show that

$$\begin{aligned}&{\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|u(t_1,\,t_2,\,x_1,\,x_2)|{\,d\mu _\lambda (x_1,x_2)}\le \Vert f\Vert _{L^1({\mathbb {R}}_\lambda )}, \end{aligned}$$
(5.6)
$$\begin{aligned}&{\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|v(t_1,\,t_2,\,x_1,\,x_2)|{\,d\mu _\lambda (x_1,x_2)}\lesssim \Vert {R_{\Delta _\lambda ,\,1}}f\Vert _{L^1({\mathbb {R}}_\lambda )}, \end{aligned}$$
(5.7)
$$\begin{aligned}&{\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|w(t_1,\,t_2,\,x_1,\,x_2)|{\,d\mu _\lambda (x_1,x_2)}\lesssim \Vert {R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}, \end{aligned}$$
(5.8)

and

$$\begin{aligned} {\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|z(t_1,\,t_2,\,x_1,\,x_2)|{\,d\mu _\lambda (x_1,x_2)}\lesssim \Vert {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}. \end{aligned}$$
(5.9)

To this end, we first note that (5.6) follows from \({\mathrm{(S_{i}})}\) in Lemma 2.8. Moreover, by the fact that for any \(t,\,y\in {\mathbb {R}}_+\),

$$\begin{aligned} {\int _0^\infty }xy P^{[\lambda +1]}_t(x, y){dm_\lambda }(x)\,\lesssim 1, \end{aligned}$$
(5.10)

we obtain that for every \(f\in {L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\) and \(t\in {\mathbb {R}}_+\),

$$\begin{aligned} \left\Vert{Q^{[\lambda ]}_t}f\right\Vert_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}= & {} \left\Vert{\int _0^\infty }\cdot yP^{[\lambda +1]}_t(\cdot , y){R_{\Delta _\lambda }}(f)(y)\,{dm_\lambda }(y)\right\Vert_{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\nonumber \\\lesssim & {} \Vert {R_{\Delta _\lambda }}f\Vert _{L^1({{\mathbb {R}}}_+,\, dm_\lambda )}; \end{aligned}$$
(5.11)

see [5, p. 208]. Therefore, by the uniform \({L^1({{\mathbb {R}}}_+,\, dm_\lambda )}\)-boundedness of \(\{{P^{[\lambda ]}_t}\}_{t>0}\) (Lemma 2.8\({\mathrm{(S_i)}}\)), we see that for any \(t_1\), \(t_2\in {\mathbb {R}}_+\),

$$\begin{aligned} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|v(t_1,\,t_2,\,x_1,\,x_2)|{\,d\mu _\lambda (x_1,x_2)}\le & {} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left|{Q^{[\lambda ]}_{t_1}}f(x_1,\,x_2)\right|{\,d\mu _\lambda (x_1,x_2)}\\\lesssim & {} \Vert {R_{\Delta _\lambda ,\,1}}f\Vert _{L^1({\mathbb {R}}_\lambda )}. \end{aligned}$$

This implies (5.7). Similarly, we have (5.8).

Finally, from (5.11), we deduce that

$$\begin{aligned} z(t_1,\,t_2,\,x_1,\,x_2)= & {} {\int _0^\infty }{\int _0^\infty }x_1y_1P^{[\lambda +1]}_{t_1}(x_1, y_1)x_2y_2 P^{[\lambda +1]}_{t_2}(x_2, y_2)\\&\times {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f(y_1, y_2)\,d\mu _\lambda (y_1,y_2). \end{aligned}$$

By this and (5.10), we obtain (5.9) immediately. This finishes the proof of Lemma 5.2. \(\square \)
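The reduction at the start of the proof of Lemma 5.2 can be made explicit as follows. This is a sketch under the assumption, suggested by the way Definition 1.5 is used in the proof of Theorem 1.6 but not restated in this section, that \(\Vert f\Vert _{H^1_{Riesz}({\mathbb {R}}_\lambda )}\) is the sum of the four \(L^1({\mathbb {R}}_\lambda )\) norms on the right-hand sides of (5.6) through (5.9). Since \(F\) is the Euclidean norm of the vector \((u, v, w, z)\), we have

$$\begin{aligned} F=\left\lbrace u^2+v^2+w^2+z^2\right\rbrace ^{\frac{1}{2}}\le |u|+|v|+|w|+|z|, \end{aligned}$$

and hence, by (5.6) through (5.9),

$$\begin{aligned} {\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}F\,d\mu _\lambda&\lesssim \Vert f\Vert _{L^1({\mathbb {R}}_\lambda )}+\Vert {R_{\Delta _\lambda ,\,1}}f\Vert _{L^1({\mathbb {R}}_\lambda )}+\Vert {R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}\\&\quad +\Vert {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}. \end{aligned}$$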

The following lemma was established in [68, Lemma 5], see also [5, p. 206].

Lemma 5.3

Let \(\lambda \in [0, \infty )\), \(p\in [2\lambda /(2\lambda +1), \infty )\), \(\triangle _{t,\,x}\) be as in (1.1) and \(F:=(u, v)\) with u and v satisfying the equation (1.2). If \(|F|>0\), then

$$\begin{aligned} \triangle _{t,\,x}|F|^p\ge 0. \end{aligned}$$

Proof of Theorem 1.6

We first show that for any \(f\in H^1_{\Delta _\lambda }({\mathbb {R}}_\lambda )\),

$$\begin{aligned} \Vert f\Vert _{H^1_{Riesz}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^1_{\Delta _\lambda }({\mathbb {R}}_\lambda )}. \end{aligned}$$

To see this, based on Definition 1.5, it suffices to prove that

$$\begin{aligned}&\Vert f\Vert _{L^1({\mathbb {R}}_\lambda )}+\Vert {R_{\Delta _\lambda ,\,1}}f\Vert _{L^1({\mathbb {R}}_\lambda )}+ \Vert {R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}+ \Vert {R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}f\Vert _{L^1({\mathbb {R}}_\lambda )}\\&\quad \lesssim \Vert f\Vert _{H^1_{\Delta _\lambda }({\mathbb {R}}_\lambda )}, \end{aligned}$$

which follows from (5.1) with \(p:=1\) and the fact that \(H^1_{\Delta _\lambda }({\mathbb {R}}_\lambda )\) is a subspace of \(L^1({\mathbb {R}}_\lambda )\).

Conversely, assume that \(f\in H^1_{Riesz}({\mathbb {R}}_\lambda )\). By Theorem 1.4, it suffices to show that

$$\begin{aligned} \Vert f\Vert _{H^1_{{\mathcal {R}}_{P}}({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^1_{Riesz}({\mathbb {R}}_\lambda )}. \end{aligned}$$

To this end, based on Lemma 5.2, it remains to prove that

$$\begin{aligned} \Vert f\Vert _{H^1_{{\mathcal {R}}_{P}}({\mathbb {R}}_\lambda )}=\Vert u^*\Vert _{L^1({\mathbb {R}}_\lambda )}\lesssim {\displaystyle {\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}} F(t_1,\,t_2,\,x_1,\,x_2)\,{\,dm_\lambda (x_1)}{\,dm_\lambda (x_2)},\nonumber \\ \end{aligned}$$
(5.12)

where \(u^*\) and F are as in (5.4) and (5.5). We first claim that we only need to show that for \(p\in \big (\frac{2\lambda +1}{2\lambda +2}, 1\big )\) and \(\epsilon _1,\,t_1,\,\epsilon _2,\, t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^p(\epsilon _1+t_1,\,\epsilon _2+ t_2,\,x_1,\,x_2)\lesssim {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}\left(F^p(\epsilon _1,\,\epsilon _2,\,\cdot ,\,\cdot )\right)(x_1,x_2). \end{aligned}$$
(5.13)

Indeed, by Lemma 5.2, we see that \(F\in {L^1({\mathbb {R}}_\lambda )}\). If (5.13) holds, then the uniform \({L^r({{\mathbb {R}}}_+,\, dm_\lambda )}\)-boundedness of \(\{{P^{[\lambda ]}_t}\}_{t>0}\), with \(r:=1/p\), implies that \(\{F^p(\epsilon _1,\,\epsilon _2,\,\cdot ,\,\cdot )\}_{\epsilon _1,\,\epsilon _2>0}\) is bounded in \(L^r({\mathbb {R}}_\lambda )\). Since \({L^r({\mathbb {R}}_\lambda )}\) is reflexive, there exist two sequences \(\{\epsilon _{1,\,k}\}\), \(\{\epsilon _{2,\,j}\}\downarrow 0\) and \(h\in L^r({\mathbb {R}}_\lambda )\) such that \(\{F^p(\epsilon _{1,\,k},\,\epsilon _{2,\,j},\,\cdot ,\,\cdot )\}_{\epsilon _{1,\,k},\,\epsilon _{2,\,j}>0}\) converges weakly to h in \(L^r({\mathbb {R}}_\lambda )\) as \(k,\,j\rightarrow \infty \). Moreover, by Hölder’s inequality, we see that

$$\begin{aligned} \Vert h\Vert _{L^r({\mathbb {R}}_\lambda )}^r&=\bigg \{\displaystyle \sup _{\Vert g\Vert _{L^{r'}({\mathbb {R}}_\lambda )}\le 1}\left|{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}g(x_1,x_2)h(x_1,x_2)\,d\mu _\lambda (x_1,x_2)\right|\bigg \}^r \nonumber \\&=\bigg \{\displaystyle \sup _{\Vert g\Vert _{L^{r'}({\mathbb {R}}_\lambda )}\le 1}\lim _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }} \left|{\displaystyle {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}} g(x_1, x_2)F^p(\epsilon _{1,\,k},\,\epsilon _{2,\,j},\,\cdot ,\,\cdot )\,d\mu _\lambda (x_1,x_2)\right|\bigg \}^r \nonumber \\&\le \displaystyle \limsup _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }}\big \Vert F^p(\epsilon _{1,\,k},\,\epsilon _{2,\,j},\,\cdot ,\,\cdot )\big \Vert _{L^r({\mathbb {R}}_\lambda )}^r \nonumber \\&\le \sup _{t_1>0,\,t_2>0}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}F(t_1,\,t_2,\,x_1,\,x_2)\,d\mu _\lambda (x_1,x_2). \end{aligned}$$
(5.14)

Since F is continuous in \(t_1\) and \(t_2\), for any \(x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^p(t_1+\epsilon _{1,\,k}, t_2+\epsilon _{2,\,j}, x_1, x_2)\rightarrow F^p(t_1, t_2, x_1, x_2) \end{aligned}$$

as \(k,\,j\rightarrow \infty \). Observe that for each \(x_1,\,x_2\in (0, \infty )\),

$$\begin{aligned} {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}(F^p(\epsilon _{1,\,k},\,\epsilon _{2,\,j},\,\cdot ,\,\cdot ))(x_1, x_2)\rightarrow {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}(h)(x_1, x_2) \end{aligned}$$

as \(k,\,j\rightarrow \infty \). Thus, by these facts and (5.13), we have that for any \(t_1,\,t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^p(t_1, t_2, x_1, x_2)= & {} \displaystyle \lim _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }}F^p(t_1+\epsilon _{1,\,k}, t_2+\epsilon _{2,\,j}, x_1, x_2)\\\lesssim & {} \displaystyle \lim _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }}{P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}(F^p(\epsilon _{1,\,k}, \,\epsilon _{2,\,j},\,\cdot ,\,\cdot ))(x_1,x_2)\\= & {} {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}(h)(x_1, x_2). \end{aligned}$$

Therefore,

$$\begin{aligned} {[}u^*(x_1, x_2)]^p\le \sup _{t_1>0,\,t_2>0}F^p(t_1,t_2, x_1, x_2)\lesssim {\mathcal {R}}_P(h)(x_1, x_2). \end{aligned}$$

By this together with \(r:=1/p\), the \({L^r({\mathbb {R}}_\lambda )}\)-boundedness of \({\mathcal {R}}_P\) and (5.14), we then have

$$\begin{aligned} \Vert u^*\Vert _{L^1({\mathbb {R}}_\lambda )}&\lesssim \Vert {\mathcal {R}}_P(h)\Vert ^r_{L^r({\mathbb {R}}_\lambda )} \lesssim {\sup _{t_1>0,\,t_2>0}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}} F(t_1,\,t_2,\,x_1,\,x_2)\,d\mu _\lambda (x_1,x_2), \end{aligned}$$

which implies (5.12). Thus the claim holds.
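The exponent bookkeeping behind the claim may be worth recording. With \(r:=1/p\), so that \(pr=1\) and \(r\in (1,\infty )\) because \(p<1\), the following identities are immediate from the definition of the \(L^r({\mathbb {R}}_\lambda )\) norm and are stated here only as a reading aid:

$$\begin{aligned} \Vert u^*\Vert _{L^1({\mathbb {R}}_\lambda )}&={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left\lbrace [u^*(x_1,x_2)]^p\right\rbrace ^{r}\,d\mu _\lambda (x_1,x_2)=\big \Vert (u^*)^p\big \Vert _{L^r({\mathbb {R}}_\lambda )}^r,\\ \big \Vert F^p(\epsilon _1,\,\epsilon _2,\,\cdot ,\,\cdot )\big \Vert _{L^r({\mathbb {R}}_\lambda )}^r&={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}F(\epsilon _1,\,\epsilon _2,\,x_1,\,x_2)\,d\mu _\lambda (x_1,x_2). \end{aligned}$$

The first identity explains the first inequality in the preceding display, and the second explains the last inequality in (5.14). The restriction \(p<1\) is what places the argument in a reflexive space on which \({\mathcal {R}}_P\) is bounded.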

Now we prove (5.13). Observe that for any fixed \(t_2,\,x_2\in {\mathbb {R}}_+\), \(u,\,v\) and \(w,\,z\) respectively satisfy the Cauchy-Riemann equations for \(t_1\) and \(x_1\), and for any fixed \(t_1, \,x_1\in {\mathbb {R}}_+\), \(u,\,w\) and \(v,\,z\) respectively satisfy the Cauchy-Riemann equations for \(t_2\) and \(x_2\). That is,

$$\begin{aligned} \displaystyle \left\{ \begin{array}{ll} {\partial }_{x_1} u+{\partial }_{t_1} v=0, \\ {\partial }_{t_1} u-{\partial }_{x_1} v=\frac{2\lambda }{x_1}v; \end{array} \right. \quad \left\{ \begin{array}{ll} {\partial }_{x_1} w+{\partial }_{t_1} z=0, \\ {\partial }_{t_1} w-{\partial }_{x_1} z=\frac{2\lambda }{x_1}z; \end{array} \right. \end{aligned}$$
(5.15)

and

$$\begin{aligned} \left\{ \begin{array}{ll} {\partial }_{x_2} u+{\partial }_{t_2} w=0, \\ {\partial }_{t_2} u-{\partial }_{x_2} w=\frac{2\lambda }{x_2}w; \end{array} \right. \quad \left\{ \begin{array}{ll} {\partial }_{x_2} v+{\partial }_{t_2} z=0, \\ {\partial }_{t_2} v-{\partial }_{x_2} z=\frac{2\lambda }{x_2}z. \end{array} \right. \end{aligned}$$
(5.16)

For fixed \(t_2,\, x_2\in {\mathbb {R}}_+\), let

$$\begin{aligned} F_1(t_1,\,t_2,\,x_1,\,x_2):=\left\lbrace [u(t_1,\,t_2,\,x_1,\,x_2)]^2 +[v(t_1,\,t_2,\,x_1,\,x_2)]^2\right\rbrace ^{\frac{1}{2}}, \end{aligned}$$

where \(t_1,\,x_1\in {\mathbb {R}}_+\). For the moment, we fix \(t_2\), \(x_2\) and regard \(F_1\) as a function of \(t_1\) and \(x_1\). By (5.15), for \(p\in (\frac{2\lambda +1}{2\lambda +2}, 1)\),

$$\begin{aligned} {\partial }_{t_1}^2 F^p_1(t_1,\,t_2,\,x_1,\,x_2)+{\partial }_{x_1}^2 F^p_1(t_1,\,t_2,\,x_1,\,x_2) +\frac{2\lambda }{x_1}{\partial }_{x_1} F^p_1(t_1,\,t_2,\,x_1,\,x_2)\ge 0;\nonumber \\ \end{aligned}$$
(5.17)

see Lemma 5.3. We also observe that for any \(\epsilon _1,\,t_2,\,x_1, x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} \lim _{t_1\rightarrow 0}P^{[\lambda ]}_{t_1}(F^p_1(\epsilon _1,\,t_2,\,\cdot ,\,x_2))(x_1)=F^p_1(\epsilon _1,\,t_2,\,x_1,\,x_2), \end{aligned}$$
(5.18)

and by Lemma 5.2, for all \(t_2\in {\mathbb {R}}_+\) and almost every \(x_2\in {\mathbb {R}}_+\),

$$\begin{aligned}&\sup _{t_1>0}{\int _0^\infty }\left[F^p_1(t_1,\,t_2,\,x_1,\,x_2)\right]^r{\,dm_\lambda (x_1)}\nonumber \\&\quad \le \sup _{t_1>0}{\int _0^\infty }F(t_1,\,t_2,\,x_1,\,x_2){\,dm_\lambda (x_1)}<\infty . \end{aligned}$$
(5.19)

Now we claim that for any \(\epsilon _1, t_1,\,t_2,\,x_1, x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^p_1(\epsilon _1+t_1,\,t_2,\,x_1,\,x_2)\le {P^{[\lambda ]}_{t_1}}\left(F^p_1(\epsilon _1,\,t_2,\,\cdot ,\, x_2)\right)(x_1). \end{aligned}$$
(5.20)

Indeed, as in [5], for any \(\epsilon _1,\, t_1,\,t_2,\,x_1, x_2\in {\mathbb {R}}_+\), let

$$\begin{aligned} \widetilde{F}_{1,\,t_2,\,x_2}(t_1,\,x_1):=\left\lbrace [\widetilde{u}(t_1,\,t_2,\,x_1,\,x_2)]^2+[\widetilde{v}(t_1,\,t_2,\,x_1,\,x_2)]^2\right\rbrace ^{\frac{1}{2}} \end{aligned}$$

and

$$\begin{aligned} U_{\epsilon _1,\,t_2,\,x_2}(t_1,\,x_1):=\widetilde{F}_{1,\,t_2,\,x_2}^p(\epsilon _1+t_1,\,x_1)-\widetilde{P^{[\lambda ]}_{t_1}}[\widetilde{F}_{1,\,t_2,\,x_2}^p(\epsilon _1,\,\cdot )](x_1), \end{aligned}$$

where for fixed \(t_2\) and \(x_2\), \(\widetilde{P^{[\lambda ]}_{t_1}}(\widetilde{F}_{1,\,t_2,\,x_2}^p), \widetilde{u}\) are even extensions of \({P^{[\lambda ]}_{t_1}}(\widetilde{F}_{1,\,t_2,\,x_2}^p)\) and u, and \(\widetilde{v}\) is the odd extension of v with respect to \(x_1\) to \({\mathbb {R}}_+\times {{\mathbb {R}}}\), respectively. By (1.1), (5.17), (5.18) and (5.19), it is not difficult to check that \(U_{\epsilon _1,\,t_2,\,x_2}\) satisfies (i)-(iv) of Lemma 5.1. Then an application of Lemma 5.1 shows that \(U_{\epsilon _1,\,t_2,\,x_2}(t_1,x_1)\le 0\). Thus, (5.20) holds.

Similarly, let

$$\begin{aligned} F_2(t_1,\,t_2,\,x_1,\,x_2):=\left\lbrace [w(t_1,\,t_2,\,x_1,\,x_2)]^2 +[z(t_1,\,t_2,\,x_1,\,x_2)]^2\right\rbrace ^{\frac{1}{2}}. \end{aligned}$$

Since (5.17), (5.18) and (5.19) all hold with \(F_1\) replaced with \(F_2\), we have that for any \(\epsilon _1, t_1,\,t_2\), \(x_1\), \(x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^p_2(\epsilon _1+t_1,\,t_2,\,x_1,\,x_2)\le {P^{[\lambda ]}_{t_1}}\left(F^p_2(\epsilon _1,\,t_2,\,\cdot ,\, x_2)\right)(x_1). \end{aligned}$$
(5.21)

Observe that for any \(t_1,\,t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F(t_1,\,t_2,\,x_1,\,x_2)\sim \sum _{i=1}^2F_i(t_1,\,t_2,\,x_1,\,x_2). \end{aligned}$$

By this fact, (5.20), (5.21) and Lemma 2.8\({\mathrm{(S_{ii}})}\), we have that

$$\begin{aligned} F^p(\epsilon _1+t_1,\,t_2,\,x_1,\,x_2)\lesssim {P^{[\lambda ]}_{t_1}}\left(F^p(\epsilon _1,\,t_2,\,\cdot ,\,\cdot )\right)(x_1,x_2). \end{aligned}$$
(5.22)
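The passage from (5.20) and (5.21) to (5.22) can be sketched as follows, under the assumption that the property of Lemma 2.8 invoked here amounts to the positivity of \({P^{[\lambda ]}_{t_1}}\), that is, \(g_1\le g_2\) implies \({P^{[\lambda ]}_{t_1}}g_1\le {P^{[\lambda ]}_{t_1}}g_2\) (Lemma 2.8 is not restated in this section). Since \(F^2=F_1^2+F_2^2\), we have \(F_i\le F\le F_1+F_2\) for \(i\in \{1,2\}\), and, because \(p\in (0,1)\), also \((a+b)^p\le a^p+b^p\) for all \(a,\,b\ge 0\). Therefore,

$$\begin{aligned} F^p(\epsilon _1+t_1,\,t_2,\,x_1,\,x_2)&\le \sum _{i=1}^2F_i^p(\epsilon _1+t_1,\,t_2,\,x_1,\,x_2)\\&\le \sum _{i=1}^2{P^{[\lambda ]}_{t_1}}\left(F^p_i(\epsilon _1,\,t_2,\,\cdot ,\, x_2)\right)(x_1)\\&\le 2\,{P^{[\lambda ]}_{t_1}}\left(F^p(\epsilon _1,\,t_2,\,\cdot ,\, x_2)\right)(x_1), \end{aligned}$$

where the second inequality is (5.20) together with (5.21) and the third uses \(F_i^p\le F^p\) and the assumed positivity.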

Moreover, from (5.16), Lemma 5.2 and Lemma 5.1, we also deduce that

$$\begin{aligned} F^p(t_1,\,\epsilon _2+ t_2,\,x_1,\,x_2)\lesssim {P^{[\lambda ]}_{t_2}}\left(F^p(t_1,\,\epsilon _2,\,\cdot ,\,\cdot )\right)(x_1,x_2). \end{aligned}$$

Now by this and (5.22), we conclude that

$$\begin{aligned} F^p(\epsilon _1+t_1,\,\epsilon _2+ t_2,\,x_1,\,x_2)\lesssim & {} {P^{[\lambda ]}_{t_1}}\left(F^p(\epsilon _1,\,\epsilon _2+ t_2,\,\cdot ,\,\cdot )\right)(x_1,x_2)\\\lesssim & {} {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}\left(F^p(\epsilon _1,\,\epsilon _2,\,\cdot ,\,\cdot )\right)(x_1,x_2). \end{aligned}$$

This implies (5.13), and hence finishes the proof of Theorem 1.6. \(\square \)

We next present the proof of Theorem 1.8.

Proof of Theorem 1.8

We first assume that \(f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\). By Theorem 1.4, we see that \({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\in {L^p({\mathbb {R}}_\lambda )}\) for all \(t_1,\,t_2\in {\mathbb {R}}_+\), with

$$\begin{aligned} \left\Vert{P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\right\Vert_{L^p({\mathbb {R}}_\lambda )}\le \Vert {{\mathcal {R}}}_{P} f\Vert _{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}, \end{aligned}$$

where the implicit constant is independent of \(t_1, t_2\) and f. Moreover, by the semigroup property of \(\left\{ {P^{[\lambda ]}_t}\right\} _{t>0}\) contained in Lemma 2.8 and Theorem 1.4, we obtain that

$$\begin{aligned} \left\Vert{P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\right\Vert_{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\sim \left\Vert{{\mathcal {R}}}_{P}\left({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\right)\right\Vert_{L^p({\mathbb {R}}_\lambda )}\le \left\Vert{{\mathcal {R}}}_{P}\left(f\right)\right\Vert_{L^p({\mathbb {R}}_\lambda )}\lesssim \Vert f\Vert _{H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}. \end{aligned}$$

This implies \({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\) with the norm independent of \(t_1\) and \(t_2\). Then an application of (5.1) shows that (1.6) holds.

Conversely, we assume that f satisfies (1.6) and show \(f\in {H^p_{\Delta _\lambda }({\mathbb {R}}_\lambda )}\). The proof is similar to that of Theorem 1.6. Precisely, we first claim that

$$\begin{aligned} \sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}|F(t_1, t_2, x_1, x_2)|^p\,{\,d\mu _\lambda (x_1,x_2)}\lesssim 1, \end{aligned}$$
(5.23)

where \(F(t_1, t_2, x_1, x_2)\) is as in (5.5). From Proposition 2.10, we see that the definition of F makes sense. Indeed, for \(\delta _1,\,\delta _2,\,t_1,\,t_2,\, x_1,\,x_2\in {\mathbb {R}}_+\), let

$$\begin{aligned}&u(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2):={P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2),\\&v(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2):={Q^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2),\\&w(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2):={P^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2), \end{aligned}$$

and

$$\begin{aligned} z(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2):={Q^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2). \end{aligned}$$

Moreover, as functions of variables \((\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\), we define

$$\begin{aligned} G_1:=\left\lbrace u^2+v^2\right\rbrace ^{\frac{1}{2}},\ G_2:=\left\lbrace w^2+z^2\right\rbrace ^{\frac{1}{2}},\ G_3:=\left\lbrace u^2+w^2\right\rbrace ^{\frac{1}{2}},\ G_4:=\left\lbrace v^2+z^2\right\rbrace ^{\frac{1}{2}} \end{aligned}$$

and

$$\begin{aligned} G:=\{u^2+v^2+w^2+z^2\}^{\frac{1}{2}}. \end{aligned}$$

We now prove that for all \(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\), when \(i:=1,\,2,\)

$$\begin{aligned} G^p_i(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\le {P^{[\lambda ]}_{t_1}}\left(G^p_i(\delta _1,\,\delta _2,\,0,\,t_2,\cdot ,\, x_2)\right)(x_1), \end{aligned}$$
(5.24)

and when \(i:=3,\,4\),

$$\begin{aligned} G^p_i(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\le {P^{[\lambda ]}_{t_2}}\left(G^p_i(\delta _1,\,\delta _2,\,t_1,\,0,\, x_1,\,\cdot )\right)(x_2). \end{aligned}$$
(5.25)

Here we mention that for any function \(g\in {L^r({{\mathbb {R}}}_+,\, dm_\lambda )}\) with \(r\in [1, \infty )\),

$$\begin{aligned} P^{[\lambda ]}_0g:=\lim _{t\rightarrow 0}{P^{[\lambda ]}_t}g= g\quad {\mathrm{and}}\quad Q^{[\lambda ]}_0g:=\lim _{t\rightarrow 0}Q^{[\lambda ]}_tg={R_{\Delta _\lambda }}g; \end{aligned}$$

see [5, 6]. Since the assumption that f is restricted at infinity implies that \({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\in {L^s({\mathbb {R}}_\lambda )}\) for all \(t_1, t_2\in {\mathbb {R}}_+\) and \(s\in [p, \infty ]\), \(G_i(\delta _1,\,\delta _2,\,t_1,\,0,\, x_1,\,x_2)\) and \(G_i(\delta _1,\,\delta _2,\,0,\,t_2,\, x_1,\,x_2)\) make sense for \(i\in \{1,2,3,4\}\).

By similarity, we only show that (5.24) holds for \(G_2\). For the moment, we fix \(\delta _1,\delta _2,t_2,x_2\in {\mathbb {R}}_+\) and regard \(G_2\) as a function of \(t_1\) and \(x_1\). The argument is analogous to that for (5.20). Indeed, let

$$\begin{aligned} V_{\delta _1,\,\delta _2,\,t_2,\,x_2}(t_1,\,x_1) :=\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(t_1,\,x_1)- \widetilde{P^{[\lambda ]}_{t_1}}\left(\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,\cdot )\right)(x_1), \end{aligned}$$

where \(\widetilde{G}_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(t_1,\,x_1):=[\widetilde{w}^2+\widetilde{z}^2]^{\frac{1}{2}}\), \(\widetilde{w}\) and \(\widetilde{z}\) are, respectively, the even and odd extensions of w and z with respect to \(x_1\) to \({\mathbb {R}}\), and \(\widetilde{P^{[\lambda ]}_{t_1}}(\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,\cdot ))(x_1)\) is the even extension of \({P^{[\lambda ]}_{t_1}}(\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,\cdot ))(x_1)\) to \({\mathbb {R}}\) with respect to \(x_1\).

We now show that \(V_{\delta _1,\,\delta _2,\,t_2,\,x_2}\) satisfies (i)-(iv) of Lemma 5.1. In fact, since \(V_{\delta _1,\,\delta _2,\,t_2,\,x_2}\) is an even function with respect to \(x_1\), we only need to consider \(x_1\in {\mathbb {R}}_+\). Since the assumption that f is restricted at infinity implies that \({P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}f\in {L^s({\mathbb {R}}_\lambda )}\) for all \(t_1, t_2\in {\mathbb {R}}_+\) and \(s\in [p, \infty ]\), by Lemma 2.8\({\mathrm{(S_i)}}\) and the uniform boundedness of \({Q^{[\lambda ]}_t}\) on \({L^2({{\mathbb {R}}}_+,\, dm_\lambda )}\) (see [68, p. 87]), we further obtain that for fixed \(\delta _1,\,\delta _2,\, t_1,\, t_2\in {\mathbb {R}}_+\),

$$\begin{aligned}&\displaystyle {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}G^2_2(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\,d\mu _\lambda (x_1,x_2)\nonumber \\&\quad = \displaystyle {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left[\left|{P^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2\right.\nonumber \\&\qquad \left.+\left|{Q^{[\lambda ]}_{t_1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2\right]\,d\mu _\lambda (x_1,x_2)\nonumber \\&\quad \lesssim \displaystyle {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left[{P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f(x_1, x_2)\right]^2\,d\mu _\lambda (x_1,x_2)<\infty . \end{aligned}$$
(5.26)

Observe that for all \(\delta _1,\,\delta _2,\,t_2\) and almost every \(x_1,\,x_2\in (0, \infty )\),

$$\begin{aligned} {[}G_2(\delta _1,\,\delta _2,\,0,\,t_2,\,x_1,\,x_2)]^2= & {} \left|{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2\\&+\left|{R_{\Delta _\lambda ,\,1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2. \end{aligned}$$

This fact together with Lemma 2.8\({\mathrm{(S_i)}}\), the uniform boundedness of \({Q^{[\lambda ]}_{t_2}}\) on \(L^2({\mathbb {R}}_+, {\,dm_\lambda (x_2)})\) and the boundedness of \({R_{\Delta _\lambda ,\,1}}\) on \(L^2({\mathbb {R}}_+, {\,dm_\lambda (x_1)})\) implies that

$$\begin{aligned}&\displaystyle \int _0^\infty \displaystyle \int _0^\infty \left[{P^{[\lambda ]}_{t_1}}(G^p_2(\delta _1,\,\delta _2,\,0,\,t_2,\,\cdot ,\,x_2))(x_1)\right]^{2/p}\,d\mu _\lambda (x_1,x_2)\\&\quad \lesssim \displaystyle \int _0^\infty \displaystyle \int _0^\infty \left|G_2(\delta _1,\,\delta _2,\,0,\,t_2,\,x_1,\,x_2)\right|^2\,d\mu _\lambda (x_1,x_2)\\&\quad \lesssim \displaystyle \int _0^\infty \displaystyle \int _0^\infty \left[\left|{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2\right.\\&\qquad \left. +\left|{R_{\Delta _\lambda ,\,1}}{Q^{[\lambda ]}_{t_2}}\left({P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f\right)(x_1,x_2)\right|^2\right]\,d\mu _\lambda (x_1,x_2)\\&\quad \lesssim \displaystyle \int _0^\infty \displaystyle \int _0^\infty \left|{P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f(x_1,x_2)\right|^2\,d\mu _\lambda (x_1,x_2), \end{aligned}$$

which, together with (5.26), yields that for all \(\delta _1,\,\delta _2,\,t_2\) and almost all \(x_2\),

$$\begin{aligned}&\displaystyle \sup _{0<t_1<\infty }\displaystyle \int _0^\infty |V_{\delta _1,\,\delta _2,\,t_2,\,x_2}(t_1,\,x_1)|^{2/p}\,dm_\lambda (x_1)\\&\quad \lesssim \displaystyle \int _0^\infty \left|{P^{[\lambda ]}_{\delta _1}}{P^{[\lambda ]}_{\delta _2}}f(x_1,x_2)\right|^2\,dm_\lambda (x_1)<\infty . \end{aligned}$$

Therefore, for fixed \(\delta _1,\,\delta _2,\,t_2,\,x_2\), \(V_{\delta _1,\,\delta _2,\,t_2,\,x_2}(t_1, x_1)\) satisfies the assumptions of Lemma 5.1 and hence (5.24) for \(G_2\) follows from Lemma 5.1 immediately.
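Among the hypotheses of Lemma 5.1, condition (iii) is the one that depends most directly on the choice of the subtracted term, so we record it explicitly. With the convention \(P^{[\lambda ]}_0g=g\) recalled above, and reading the limit in the sense in which that convention is stated, we have for \(x_1\in {\mathbb {R}}_+\),

$$\begin{aligned} V_{\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,x_1)=\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,x_1)-P^{[\lambda ]}_{0}\left(\widetilde{G}^p_{2,\,\delta _1,\,\delta _2,\,t_2,\,x_2}(0,\,\cdot )\right)(x_1)=0. \end{aligned}$$

Condition (iv) is then the integrability estimate just obtained, with exponent \(r:=2/p\).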

By an argument involving (5.24) and (5.25), we further see that for almost all \(x_1\) and \(x_2\),

$$\begin{aligned} G^p(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\lesssim {P^{[\lambda ]}_{t_2}}{P^{[\lambda ]}_{t_1}}\left(G^p(\delta _1,\,\delta _2,\,0,\,0,\,\cdot ,\,\cdot )\right)(x_1, x_2). \end{aligned}$$

Moreover, observe that

$$\begin{aligned}&G(\delta _1,\,\delta _2,\,0,\,0,\,x_1,\,x_2)\\&\quad \sim \left|P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2}(f)(x_1, x_2)\right|+\left|{R_{\Delta _\lambda ,\,1}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|\\&\qquad +\left|{R_{\Delta _\lambda ,\,2}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|+\left|{R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|. \end{aligned}$$

Using these facts, (1.6) and Lemma 2.8\({\mathrm{(S_i)}}\), we see that

$$\begin{aligned}&{\int _0^\infty \int _0^\infty }[G(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)]^p\,{\,d\mu _\lambda (x_1,x_2)}\nonumber \\&\quad \lesssim {\int _0^\infty \int _0^\infty }{P^{[\lambda ]}_{t_2}}{P^{[\lambda ]}_{t_1}}([G(\delta _1,\,\delta _2,\,0,\,0,\,\cdot ,\,\cdot )]^p)(x_1,x_2)\,{\,d\mu _\lambda (x_1,x_2)}\nonumber \\&\quad \lesssim {\int _0^\infty \int _0^\infty }|G(\delta _1,\,\delta _2,\,0,\,0,\,x_1,\,x_2)|^p\,{\,d\mu _\lambda (x_1,x_2)}\nonumber \\&\quad \sim \displaystyle \int _0^\infty \displaystyle \int _0^\infty \left[\left|P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2}(f)(x_1, x_2)\right|+\left|{R_{\Delta _\lambda ,\,1}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|\right.\nonumber \\&\qquad +\left|{R_{\Delta _\lambda ,\,2}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|\nonumber \\&\qquad \left.+\left|{R_{\Delta _\lambda ,\,1}}{R_{\Delta _\lambda ,\,2}}\left(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\right)(x_1, x_2)\right|\right]^p\,{\,d\mu _\lambda (x_1,x_2)}\nonumber \\&\quad \lesssim 1, \end{aligned}$$
(5.27)

where the implicit constant is independent of \(t_1\), \(t_2\), \(\delta _1\) and \(\delta _2\).

Observe that for each \(t_1,\,t_2\), \(x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} G(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)\rightarrow F(t_1, t_2, x_1, x_2)\quad {\mathrm{as}}\quad \delta _1,\,\delta _2\rightarrow 0. \end{aligned}$$

Indeed, observe that \(P^{[\lambda ]}_{\delta _1}P^{[\lambda ]}_{\delta _2} f\rightarrow f\) in \({\mathcal {G}}(1,1;1,1)'\) as \(\delta _1,\,\delta _2\rightarrow 0\). By these facts together with (5.27) and Fatou’s lemma, we further have (5.23).
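The use of Fatou's lemma here can be written out in one line. Taking the pointwise convergence \(G\rightarrow F\) displayed above as given, and using that the bound in (5.27) is uniform in \(\delta _1\) and \(\delta _2\), we have for every \(t_1,\,t_2\in {\mathbb {R}}_+\),

$$\begin{aligned} {\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}|F(t_1, t_2, x_1, x_2)|^p\,d\mu _\lambda (x_1,x_2)\le \liminf _{\delta _1,\,\delta _2\rightarrow 0}{\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}[G(\delta _1,\,\delta _2,\,t_1,\,t_2,\,x_1,\,x_2)]^p\,d\mu _\lambda (x_1,x_2)\lesssim 1, \end{aligned}$$

and taking the supremum over \(t_1\) and \(t_2\) gives (5.23).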

Let \(q\in ({2\lambda +1\over 2\lambda +2}, p)\) and \(r:=p/q\). As in the proof of (5.13), we see that

$$\begin{aligned} F^q(\delta _1+t_1,\,\delta _2+ t_2,\,x_1,\,x_2)\lesssim {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}\left(F^q(\delta _1,\,\delta _2,\,\cdot ,\,\cdot )\right)(x_1,x_2). \end{aligned}$$
(5.28)

By (5.23), there exists a subsequence \(\{F^q(\delta _{1,\,k},\,\delta _{2,\,j},\,\cdot ,\,\cdot )\}_{\delta _{1,\,k};\,\delta _{2,\,j}>0}\) of \(\{F^q(\delta _1,\,\delta _2,\,\cdot ,\,\cdot )\}_{\delta _1;\,\delta _2>0} \) and \(h\in L^r({\mathbb {R}}_\lambda )\) such that \(\{F^q(\delta _{1,\,k},\,\delta _{2,\,j},\,\cdot ,\,\cdot )\}_{\delta _{1,\,k};\,\delta _{2,\,j}>0}\) converges weakly to h in \(L^r({\mathbb {R}}_\lambda )\) as \(k,\,j\rightarrow \infty \), which further implies that

$$\begin{aligned} \Vert h\Vert ^r_{L^r({\mathbb {R}}_\lambda )}&=\bigg \{\displaystyle \sup _{\Vert g\Vert _{L^{r'}({\mathbb {R}}_\lambda )}\le 1}\lim _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }} \bigg |\displaystyle \int _0^\infty \displaystyle \int _0^\infty g(x_1,x_2)F^q(\delta _{1,\,k},\,\delta _{2,\,j},\,x_1,x_2)\,\,d\mu _\lambda (x_1,x_2)\bigg |\bigg \}^r \nonumber \\&\le \displaystyle \limsup _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }}\big \Vert F^q(\delta _{1,\,k},\,\delta _{2,\,j},\,\cdot ,\,\cdot )\big \Vert _{L^r({\mathbb {R}}_\lambda )}^r\nonumber \\&\le \displaystyle \limsup _{\genfrac{}{}{0.0pt}{}{k\rightarrow \infty }{j\rightarrow \infty }}\big \Vert F(\delta _{1,\,k},\,\delta _{2,\,j},\,\cdot ,\,\cdot )\big \Vert ^p_{L^p({\mathbb {R}}_\lambda )}\nonumber \\&\lesssim 1. \end{aligned}$$
(5.29)

Moreover, by (5.28), we then have that for any \(t_1,\,t_2,\,x_1,\,x_2\in {\mathbb {R}}_+\),

$$\begin{aligned} F^q(t_1, t_2, x_1, x_2)&\lesssim {P^{[\lambda ]}_{t_1}}{P^{[\lambda ]}_{t_2}}(h)(x_1, x_2). \end{aligned}$$

Therefore,

$$\begin{aligned} {[}u^*(x_1, x_2)]^q\le {\sup _{\genfrac{}{}{0.0pt}{}{t_1>0}{t_2>0}}}F^q(t_1,t_2, x_1, x_2)\lesssim {\mathcal {R}}_P(h)(x_1, x_2). \end{aligned}$$

By this together with the \({L^r({\mathbb {R}}_\lambda )}\)-boundedness of \({\mathcal {R}}_P\) and (5.29), we then have

$$\begin{aligned} \Vert u^*\Vert ^p_{L^p({\mathbb {R}}_\lambda )}=\Vert (u^*)^q\Vert _{L^r({\mathbb {R}}_\lambda )}^r\lesssim \Vert {\mathcal {R}}_P(h)\Vert ^r_{L^r({\mathbb {R}}_\lambda )}\lesssim \Vert h\Vert ^r_{L^r({\mathbb {R}}_\lambda )}\lesssim 1. \end{aligned}$$

This finishes the proof of Theorem 1.8. \(\square \)
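As a reading aid for the last two displays of the proof, we note the role of the auxiliary exponent \(q\). Since \(q<p\), the choice \(r:=p/q\) satisfies \(r\in (1,\infty )\) and \(qr=p\), so that, directly from the definition of the norms,

$$\begin{aligned} \Vert u^*\Vert ^p_{L^p({\mathbb {R}}_\lambda )}={\iint _{{\mathbb {R}}_+\times {\mathbb {R}}_+}}\left\lbrace [u^*(x_1,x_2)]^q\right\rbrace ^{\frac{p}{q}}\,d\mu _\lambda (x_1,x_2)=\Vert (u^*)^q\Vert _{L^r({\mathbb {R}}_\lambda )}^r. \end{aligned}$$

The same computation with \(F\) in place of \(u^*\) relates \(\Vert F^q\Vert ^r_{L^r({\mathbb {R}}_\lambda )}\) to \(\Vert F\Vert ^p_{L^p({\mathbb {R}}_\lambda )}\) in (5.29). The lower bound \(q>\frac{2\lambda +1}{2\lambda +2}\) is what allows the subharmonicity argument behind (5.28) to be repeated with \(q\) in place of \(p\).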