1 Introduction

Approximation of holomorphic functions is a general topic with a long history, studied by many authors [1,2,3,4,5,6]. Some recent studies on approximation in various holomorphic function spaces can be found in [7, 8]. Apart from the classical Hardy space, the existing studies in general weighted Hardy spaces address neither the effectiveness issue nor the adaptive construction of the approximation. The main technical effort of this paper is to construct explicitly rational orthonormal systems with fast convergence in a class of weighted Hardy spaces. The studies closest to this paper are [9, 10]. The theory developed in this study ensures rapid convergence by allowing repeated selection of the kernel parameters. The study spells out, in view of the maximal selection principle, why repeated selection of the parameters is necessary and why this procedure involves, consecutively, first and higher order directional derivatives. The idea of the Pre-orthogonal Adaptive Fourier Decomposition (POAFD) method on the weighted Hardy spaces stems from, and further develops, [9, 10]. POAFD reduces to adaptive Fourier decomposition (AFD) in the classical Hardy space context, the latter being motivated by positive and non-linear instantaneous frequency representation of signals, and standing as a new type of, and one of the most effective, sparse representation programs with ample applications in industrial and biomedical signal analysis [11, 12], image analysis [13], as well as in system identification [14, 15]. The usual greedy algorithms do not allow repeated selection of parameters and thus cannot reach the best approximation. The AFD theory is briefly introduced as follows.

It is known that a function \(f\in {\mathbb {L}}^2({\partial } \mathbf{D})\) with Fourier expansion \(f(e^{it})=\sum _{n=-\infty }^\infty c_ne^{int}\) has its Hardy space decomposition \(f=f^++f^-,\) where \(f^+(e^{it})=\sum _{n=0}^\infty c_ne^{int}\in {\mathbb H}^2_+(\partial \mathbf{D}),\) and \(f^-(e^{it})=\sum _{n=-\infty }^{-1} c_ne^{int}\in {\mathbb H}^2_-(\partial \mathbf{D}).\) This decomposition corresponds to the space direct sum decomposition

$$\begin{aligned} {\mathbb {L}}^2(\partial \mathbf{D})={\mathbb {H}}^2_+(\partial \mathbf{D})\oplus {\mathbb {H}}^2_-(\partial \mathbf{D}), \end{aligned}$$
(1.1)

where \({\mathbb {H}}^2_\pm (\partial \mathbf{D})\) are, correspondingly, the spaces of non-tangential boundary limits of the functions in the Hardy spaces inside and outside the unit disc, denoted \({\mathbb {H}}^2_\pm (\mathbf{D}),\) respectively. The projection operators of f onto \({\mathbb {H}}^2_\pm (\partial \mathbf{D})\) are respectively given by \(P_\pm (f)=(1/2)(f\pm iHf\pm c_0),\) where H denotes the Hilbert transform. The projections can also be obtained through boundary limits of the corresponding Cauchy integrals of the boundary data f. If f is real-valued, then due to the property \(c_{-n}=\overline{c}_n\) one has \(f=2\mathrm{Re}f^+-c_0.\) This suggests that analysis of the \(L^2\) functions may be reduced to analysis of the functions in the corresponding Hardy spaces [16]. Let \(\{T_k\}\) denote the rational orthogonal system corresponding to a sequence of numbers \(\mathbf{a}=(a_1,\ldots ,a_n,\ldots )\) in the unit disc, allowing repetition, of the form

$$\begin{aligned} T_k(z)=\frac{\sqrt{1-|a_k|^2}}{1-\overline{a}_kz}\prod _{l=1}^{k-1}\frac{z-a_l}{1-\overline{a}_lz}. \end{aligned}$$

\(\{T_k\}_{k=1}^\infty \) is an orthonormal system, also called a Takenaka–Malmquist or TM system, in \({\mathbb {H}}^2_+(\mathbf{D}).\) The system may or may not be a basis of \({\mathbb {H}}^2_+(\mathbf{D})\) depending on whether \(\sum _{k=1}^\infty (1-|a_k|)=\infty \) or not, respectively. In rational approximation of the Hardy space the above defined rational orthonormal systems are essential. For a given Hardy space function \(f^+,\) by selecting parameters \(a_k\)’s under the maximal selection principle,

$$\begin{aligned} a_k=\arg \sup \{ |\langle f, T^b_k \rangle |\ :\ b\in {\mathbf{D}}, b\ne a_l, l=1,\ldots , k-1\}, \end{aligned}$$

where \(T_k^b\) is \(T_k\) with \(a_k\) replaced by the undetermined b, regardless of whether the corresponding TM system is a basis or not, the resulting expansion, called the AFD expansion,

$$\begin{aligned} \sum _{k=1}^\infty \langle f,T_k\rangle T_k(z) \end{aligned}$$

converges at a fast pace [9] to \(f^+(z).\) An additional property of the decomposition, important in signal analysis [14,15,16] and being the original motivation, is that under the selection \(a_1=0\) all \(T_k\)’s are of positive frequency (boundary phase derivative). AFD thus gives rise to an intrinsic positive frequency decomposition of \(f^+\) and thus of f as well. Other related work can be found in [17, 18]. In the analytic, adaptive and positive-frequency approximation aspects the studies of Coifman et al. and Qian et al. merged together [19, 20]. Related to TM systems is the Beurling–Lax type direct sum decomposition

$$\begin{aligned} {\mathbb {H}}^2(\mathbf{D})=\overline{{\mathrm{span }}}\{{ T_k}\}_{{ k}=1}^\infty \oplus \phi {\mathbb {H}}^2(\mathbf{D}), \end{aligned}$$
(1.2)

where \(\phi \) is the Blaschke product defined by the parameters \(a_1,\ldots ,a_k,\ldots \). The relevance of the direct sum decomposition rests on the condition \(\sum _{k=1}^\infty (1-|a_k|)<\infty \) under which \(\phi \) is well defined and the TM system is not a basis.
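To make the TM construction concrete, the following minimal Python sketch (an illustration added here; the function names, the sampled grid and the boundary inner product normalization \(\frac{1}{2\pi }\int _0^{2\pi }\) are our own choices, not part of the cited algorithms) evaluates the first few TM functions for a parameter tuple with repetition and checks their orthonormality numerically on the circle.

import numpy as np

def tm_system(a, t):
    # evaluate T_1,...,T_n of the Takenaka-Malmquist system at the points e^{it}
    z = np.exp(1j * t)
    T, blaschke = [], np.ones_like(z)          # running product prod_{l<k} (z-a_l)/(1-conj(a_l)z)
    for a_k in a:
        e_k = np.sqrt(1 - abs(a_k)**2) / (1 - np.conj(a_k) * z)
        T.append(e_k * blaschke)
        blaschke = blaschke * (z - a_k) / (1 - np.conj(a_k) * z)
    return np.array(T)

a = [0.0, 0.5, 0.5, -0.3 + 0.4j]               # repeated parameters are allowed
t = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
T = tm_system(a, t)
G = T @ T.conj().T * (t[1] - t[0]) / (2 * np.pi)   # Gram matrix in (1/2pi) int_0^{2pi} ... dt
print(np.round(np.abs(G), 6))                  # approximately the identity matrix

The printed Gram matrix being close to the identity reflects the orthonormality of \(\{T_k\}\) in \({\mathbb {H}}^2_+(\partial \mathbf{D}).\)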

The aim of this study is to generalize the AFD rational approximation theory of the classical Hardy space to a unified class called the square-integrable \(\beta \)-weighted Hardy spaces, including the Bergman space and the weighted Bergman spaces, as well as the Hardy–Sobolev spaces, namely, the spaces

$$\begin{aligned} {\mathbb {H}}^2_\beta ({\mathbf{D}})= & {} \left\{ f: {\mathbf{D}}\rightarrow \mathbf{C} : f(z)=\sum _{k=0}^\infty c_kz^k,\quad z\in {\mathbf{D}},\right. \nonumber \\ \Vert f\Vert _{{\mathbb {H}}^2_\beta }^2= & {} \left. \sum _{k=0}^\infty (k+1)^\beta |c_k|^2<\infty \right\} , \end{aligned}$$
(1.3)

where \(-\infty<\beta <\infty .\) The space \({\mathbb {H}}^2_\beta \) is a particular case of the general weighted Hardy space \({\mathbb {H}}^2_W\) defined in (1.4) below. We note that the inequality condition in the last definition guarantees that f is a well defined holomorphic function in the open unit disc \(\mathbf{D}.\) The conventional Hardy–Sobolev spaces \({\mathbb {W}}_{\beta }^2, \beta >0,\) are defined as

$$\begin{aligned} {\mathbb {W}}_{\beta }^2=\left\{ f:{\mathbf{D}}\rightarrow \mathbf{C} : f(z)=\sum _{k=0}^\infty c_kz^k, \sum _{k=0}^\infty (1+k)^{2\beta }|c_k|^2<\infty \right\} . \end{aligned}$$

Therefore, \(\mathbb {H}^2_{\beta }= {\mathbb {W}}_{\beta /2}^2.\)
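As a quick numerical illustration of definition (1.3) (a hedged sketch; the coefficient sequence below is arbitrary), the \(\beta \)-weighted norm can be computed directly from the Taylor coefficients:

import numpy as np

def weighted_hardy_norm(c, beta):
    # ||f||_{H^2_beta} for f(z) = sum_k c[k] z^k, by (1.3)
    k = np.arange(len(c))
    return np.sqrt(np.sum((k + 1.0)**beta * np.abs(np.asarray(c))**2))

c = 1.0 / (np.arange(50) + 1.0)**2             # illustrative Taylor coefficients
for beta in (-1.0, 0.0, 1.0, 2.0):             # Bergman, Hardy, Dirichlet, W^2_1 weights
    print(beta, weighted_hardy_norm(c, beta))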

The general weighted Hardy spaces are defined as

$$\begin{aligned} {\mathbb {H}}^2_W({\mathbf{D}})= & {} \left\{ f: {\mathbf{D}}\rightarrow \mathbf{C} : f(z)=\sum _{k=0}^\infty c_kz^k, \quad z\in {\mathbf{D}},\right. \nonumber \\ \Vert f\Vert _{{\mathbb {H}}^2_W}^2= & {} \left. \sum _{k=0}^\infty W(k) |c_k|^2<\infty \right\} , \end{aligned}$$
(1.4)

where the weight sequence satisfies \(W(k)\ge 0\) and \(\lim _{k\rightarrow \infty }W(k)^{\frac{1}{k}}\ge 1\) [21]. Except in occasional cases, such as in Sect. 4 where we discuss the Bergman space of the upper half-plane, we suppress, in the notation for the function space, the part indicating the domain of the functions. That is, we simplify the notation \({\mathbb {H}^2_{\beta }} (\mathbf{D})\) to \({\mathbb {H}^2_{\beta }}.\)

The particular cases \({\mathbb {H}}^2_0, {\mathbb {H}}^2_{-1}, {\mathbb {H}}^2_{1}\) and \({\mathbb {H}}^2_2\) respectively correspond to the Hardy, the classical Bergman, the Dirichlet, and the Hardy–Sobolev \({\mathbb {W}}_1^2\) spaces. In general, \({\mathbb {H}}^2_\beta \) with \(\beta <0\) corresponds to the weighted Bergman space \({\mathbb A}^2_\alpha \) with \(\beta =-(1+\alpha ), \alpha >-1\) (see Sect. 5). The meaning of the space correspondence is that as function sets they are equal, while the norms are not necessarily equal but at least equivalent [22]. Due to the correspondence with the specially named spaces the approximation behaviors are just the same. In what follows we concentrate on studying the \(\beta \)-weighted Hardy spaces \({\mathbb {H}}^2_\beta .\)

It is a known fact that all the weighted Bergman spaces corresponding to \(\mathbb {H}^2_{\beta }, \beta <0,\) are reproducing kernel Hilbert spaces (RKHSs) (see Sect. 5). For \(\beta >0,\) by invoking the fact that a Hilbert space of functions is a reproducing kernel Hilbert space if and only if the point-evaluation linear functionals are bounded, we can conclude that \(\mathbb {H}^2_{\beta }\) for \(\beta >0\) are also RKHSs. In fact, due to the set inclusion relation \({\mathbb {H}}^2_\beta \subset {\mathbb {H}}^2={\mathbb {H}}^2_0\) there follows

$$\begin{aligned} |f(z)|\le C_z\Vert f\Vert _2\le C_z \Vert f\Vert _{\mathbb {H}^2_{\beta }}. \end{aligned}$$
(1.5)

This implies that \({\mathbb {H}}^2_\beta , \beta >0,\) are RKHSs.

POAFD as proposed in [10] is not valid for a general RKHS. It is available only for those that possess the so called boundary vanishing property (BVP). BVP corresponds, in fact, to the validity of the Riemann–Lebesgue Lemma in each RKHS context. It does not always hold. We will show that for \(\beta >1\) BVP does not hold in \(\mathbb {H}^2_{\beta }.\) As a result POAFD cannot be performed directly, and approximation in the latter spaces can be obtained through the inverse operators in \({\mathbb {H}}^2_{\beta '}, \beta =[\beta ]+\beta ', \beta '\in [0,1),\) or alternatively through a weak-POAFD method (see Sect. 6).

The formulation of approximation in non-Hardy reproducing kernel Hilbert spaces is difficult due to the fact that there is no useful inner function theory, and the multiplication operators by inner functions are not norm-equivalent, nor contracting. Naturally related to our approximation approach is the characterization of zero-based shift-invariant subspaces. In the Hardy space cases with the classical settings there exists the Beurling–Lax Theorem. The remarkable difference between the Hardy space and the non-Hardy reproducing kernel Hilbert spaces lies in the characterization of the zero-based invariant subspaces. The characterization problem is open for each of the non-Hardy space cases \(\mathbb {H}^2_{\beta }, \beta \ne 0.\) For instance, for the standard Bergman space \({\mathbb A}^2={\mathbb A}_0^2,\)

$$\begin{aligned} \mathbb {A}^2(\mathbf{D})\ne \overline{\mathrm{span }}\{{ B_k}\}_{{ k}=1}^\infty \oplus { H}_\mathbf{a} \mathbb {A}^2(\mathbf{D}), \end{aligned}$$

where \(\{B_k\}_{k=1}^\infty \) is the Gram-Schmidt orthonormalization of a sequence of generalized reproducing kernels (see Sect. 2) corresponding to the infinite sequence \(\mathbf{a}\) inducing a Horowitz product \(H_\mathbf{a}\) ([23], also see Sect. 5). On the other hand, we show that in all cases the direct sum decomposition holds with the Beurling type shift invariant subspace being replaced by the zero-based invariant subspace, that is

$$\begin{aligned} \mathbb {H}^2_{\beta }=\overline{\mathrm{span }}\{{ B_k}\}_{{ k}=1}^\infty \oplus I_\mathbf{a}. \end{aligned}$$

The study of characterizing the zero-based shift-invariant subspaces has attracted many notable works [24,25,26,27,28].

The paper provides in detail the POAFD procedure for the standard Bergman space case. For the other cases, we indicate the differences and the necessary changes. The special case of the upper half complex plane is presented in great detail. Related error estimates are provided. A proof of the convergence of the expansion that does not depend on inner function properties is given.

The paper is organized as follows. In Sect. 2 we introduce basic knowledge of the Bergman spaces. Section 3 constructs the rational orthonormal system in the Bergman space that will be recalled when developing POAFD as an optimal approximation method in the Bergman space. The convergence rate and the related zero-based invariant subspaces are studied. In Sects. 4 and 5 we extend the theory established in the standard Bergman space to the Bergman space of the upper half complex plane, and to the weighted Bergman spaces, respectively. In Sect. 6 we deal with the Hardy–Sobolev type spaces corresponding to \(\beta >0.\) Apart from the theoretical aspect given in this paper we have also developed the related algorithms, codes and experiments, as well as the application aspects, which are left to a forthcoming complementary paper.

2 Preliminaries

Denote by \({\mathbf{D}}\) the open unit disc in the complex plane \(\mathbf{C}.\) We will first deal with the square integrable Bergman space of the open unit disc, \(\mathbb {A}^2(\mathbf{D}),\) namely,

$$\begin{aligned} \mathbb {A}^2({\mathbf{D}})= & {} \left\{ f: {\mathbf{D}}\rightarrow \mathbf{C}\ | \ f\ \mathrm{is \ holomorphic \ in\ \mathbf{D}}, \mathrm{and}\ \Vert f\Vert _{\mathbb {A}^2({\mathbf{D}})}^2\right. \\= & {} \left. \int _{\mathbf{D}} |f(z)|^2dA<\infty \right\} , \end{aligned}$$

where dA is the normalized area measure on the unit disc: \(dA=\frac{dxdy}{\pi }, z=x+iy.\) The space \(\mathbb {A}^2(\mathbf{D})\) under the norm \(\Vert \cdot \Vert _{{\mathbb {A}^2}(\mathbf{D})}\) forms a Hilbert space with the inner product

$$\begin{aligned} \langle f,g\rangle _{\mathbb {A}^2(\mathbf{D})}=\int _{\mathbf{D}} f(z)\overline{g(z)}dA. \end{aligned}$$

In the sequel we often suppress the subscripts \(\mathbb {A}^2({\mathbf{D}})\) in the notations \(\Vert \cdot \Vert _{\mathbb {A}^2({\mathbf{D}})}\) and \(\langle \cdot ,\cdot \rangle _{\mathbb {A}^2({\mathbf{D}})}\), and write them simply as \(\Vert \cdot \Vert \) and \(\langle \cdot ,\cdot \rangle .\)

It is known that \(\mathbb {A}^2\) is a reproducing kernel Hilbert space with the reproducing kernel

$$\begin{aligned} k_a(z)=\frac{1}{(1-\overline{a}z)^2}. \end{aligned}$$

From the reproducing kernel property we have

$$\begin{aligned} \Vert k_a\Vert ^2= k_a(a)=\frac{1}{(1-|a|^2)^2}. \end{aligned}$$

Thus, the normalized reproducing kernel, denoted as \(e_a(z)\), is

$$\begin{aligned} e_a(z)=\frac{k_a(z)}{\sqrt{k_a(a)}}=\frac{1-|a|^2}{(1-\overline{a}z)^2}. \end{aligned}$$

We hence have, for any \(f\in \mathbb {A}^2,\)

$$\begin{aligned} \langle f,e_a\rangle = (1-|a|^2)f(a). \end{aligned}$$
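The identity \(\langle f,e_a\rangle = (1-|a|^2)f(a)\) admits a direct coefficient-wise check, since \(\langle z^m,z^n\rangle =\delta _{mn}/(n+1)\) in \(\mathbb {A}^2.\) The following short Python sketch (illustrative only; the truncation order and the test function are arbitrary) verifies it for a truncated Taylor series:

import numpy as np

N = 200
n = np.arange(N)

def bergman_inner(c, d):
    # <f, g>_{A^2} from Taylor coefficients, since <z^m, z^n> = delta_{mn}/(n+1)
    return np.sum(c * np.conj(d) / (n + 1.0))

a = 0.4 + 0.3j
c = 1.0 / (n + 1.0)**1.5                           # an illustrative f with these Taylor coefficients
e_a = (1 - abs(a)**2) * (n + 1.0) * np.conj(a)**n  # Taylor coefficients of e_a(z)
lhs = bergman_inner(c, e_a)
rhs = (1 - abs(a)**2) * np.sum(c * a**n)           # (1 - |a|^2) f(a), with the same truncation
print(abs(lhs - rhs))                              # agreement up to rounding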

Let \(\mathbf{a}=(a_1,\ldots ,a_n, \ldots )\) be an infinite sequence in \(\mathbf{D}.\) Here, and in the sequel, the \(a_n\)’s are allowed to repeat. Denote by \(l(a_n)\) the number of times \(a_n\) repeats in the n-tuple \(\mathbf{a}_n=(a_1,\ldots ,a_n).\) Define

$$\begin{aligned} \tilde{k}_{a_n}(z)=\left( \frac{d}{d\overline{w}}\right) ^{l(a_n)-1}\left( k_w(z)\right) |_{w=a_n}. \end{aligned}$$
(2.1)

We will call the sequence \(\{\tilde{k}_{a_n}\}\) the generalized reproducing kernels corresponding to \((a_1,\ldots ,a_n,\ldots )\).

There follows, for \(f\in \mathbb {A}^2,\)

$$\begin{aligned} \langle f, \tilde{k}_{a_n}\rangle =f^{(l(a_n)-1)}(a_n). \end{aligned}$$
(2.2)

To show (2.2), take the \((l-1)\)-th derivative of both sides of the identity

$$\begin{aligned} f(w)=\int _{\mathbf{D}} \frac{f(z)}{(1-w\overline{z})^2}dA, \end{aligned}$$

we have

$$\begin{aligned} f^{(l-1)}(w)=l!\int _{\mathbf{D}} \frac{\overline{z}^{l-1}f(z)}{(1-w\overline{z})^{l+1}}dA. \end{aligned}$$

The exchange of the orders of differentiation and integration is justified by the Lebesgue Dominated Convergence Theorem. We thus obtain (2.2). In the classical Hardy space case this relation was noted in [10]. As explained in the following sections, the consecutive derivatives of the kernel function correspond to repeated use of the parameters \(a_n.\) A number of works in the literature do not discuss the case of repeated use of the parameters (see, for instance, [6]). The present study shows that in order to obtain the best possible approximation at each step, repeated selection of parameters is necessary, and shows how the adaptive selection procedure may involve higher and higher orders of derivatives.
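For the Bergman kernel the derivative in (2.1) can be written out explicitly: a direct computation gives \(\tilde{k}_{a}(z)=l!\,z^{l-1}(1-\overline{a}z)^{-(l+1)}\) when a has repetition number l. The sketch below (an illustration under this formula, with an arbitrary test function and truncation order) checks (2.2) numerically through the coefficient form of the inner product:

import numpy as np
from math import comb, factorial

N = 300
n = np.arange(N)

def bergman_inner(c, d):
    # <f, g>_{A^2} from Taylor coefficients, since <z^m, z^n> = delta_{mn}/(n+1)
    return np.sum(c * np.conj(d) / (n + 1.0))

def tilde_kernel_coeffs(a, l):
    # Taylor coefficients of tilde{k}_a = (d/d conj(w))^{l-1} k_w |_{w=a}
    #                                   = l! z^{l-1} (1 - conj(a) z)^{-(l+1)};
    # the coefficient of z^m is l! C(m+1, l) conj(a)^{m-l+1} for m >= l-1
    d = np.zeros(N, dtype=complex)
    for m in range(l - 1, N):
        d[m] = factorial(l) * comb(m + 1, l) * np.conj(a)**(m - l + 1)
    return d

a, l = 0.3 - 0.2j, 3                               # a repeated l = 3 times
c = 0.7**n                                         # test function f(z) = 1/(1 - 0.7 z), truncated
lhs = bergman_inner(c, tilde_kernel_coeffs(a, l))
# f^{(l-1)}(a) from the Taylor series: sum_m c_m * m!/(m-l+1)! * a^{m-l+1}
rhs = sum(c[m] * (factorial(m) // factorial(m - l + 1)) * a**(m - l + 1) for m in range(l - 1, N))
print(abs(lhs - rhs))                              # agreement up to rounding, illustrating (2.2)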

Let M be a closed subspace of \(\mathbb {A}^2.\) If further \(zM\subset M,\) then we say that M is a shift-invariant subspace, or simply an invariant subspace. Let \(\mathbf{a}\) be a finite or infinite sequence of points in the unit disc. We will be interested in the zero-based invariant subspaces \(I_\mathbf{a}\) consisting of the functions that vanish at the points of \(\mathbf{a}\) with the corresponding multiplicities:

$$\begin{aligned} I_\mathbf{a}=\{f\in \mathbb {A}^2: f \ \mathrm{has \ all\ points \ in \ } \mathbf{a} \ \mathrm{together \ with \ their\ multiplicities\ as\ its\ zeros}\}. \end{aligned}$$

It is conventional that a is a zero of multiplicity l of an analytic function f if and only if \(f(z)=(z-a)^lg(z),\) where g is also analytic at a. The last fact is equivalent to \(f(a)=f'(a)=\cdots =f^{(l-1)}(a)=0.\) Denote \(\phi _a(z)=z-a.\) In the case \(I=I_{\mathbf{a}_1}=I_{\{a_1\}},\) we can easily show that \(I_{\{a_1\}}=\phi _{a_1}\mathbb {A}^2.\) Inductively, we have \(I_{\mathbf{a}_n}=\phi _{a_1}\cdots \phi _{a_n}\mathbb {A}^2,\) where \(\mathbf{a}_n=(a_1,\ldots ,a_n).\) This relation, however, is not extendable to an infinite sequence \(\mathbf{a}.\) The zero-based invariant subspaces are closed subspaces, and they themselves are reproducing kernel Hilbert spaces. Characterizations of zero-based invariant subspaces are of central importance and attract great interest of many researchers. In the Hardy space case there exist the Beurling and Beurling–Lax Theorems in relation to the inner functions of the context. Necessary and sufficient conditions for \(\mathbf{a}\) to be the zero set of some Bergman space function have been an open problem for several decades [29,30,31]. The existing partial results along this direction contribute to our understanding of when a maximally selected parameter sequence gives rise to a basis (see Sect. 3).

3 Pre-orthogonal Adaptive Fourier Decomposition in the Bergman Space

In this section we first construct, for a given infinite sequence \(\mathbf{a},\) the corresponding rational orthogonal system of the Bergman space, called the Bergman space rational orthogonal (BRO) system. Secondly, we show the validity of the so called boundary vanishing property, and subsequently the validity of the maximal selection principle. Under the maximal selections of the parameters the convergence theorem in the Bergman space is proved. The direct sum decomposition holds as a consequence. At the end of the section we study the convergence rate of the pre-orthogonal adaptive Fourier decomposition.

3.1 Rational Orthogonal System of \(\mathbb {A}^2\)

For any given infinite sequence \(\mathbf{a}=(a_1,\ldots ,a_n,\ldots )\) in the unit disc \(\mathbf{D}\) we have

Theorem 3.1

Let \((a_1,\ldots , a_n, \ldots )\) be an infinite sequence in \(\mathbf{D},\) where each \(a_n\) in the sequence is allowed to repeat. Denote \(A_n=(a_1,\ldots , a_n).\) Let the orthonormalization of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_n})\) be denoted by \((B_1,\ldots ,B_n).\) Let \(w\in {\mathbf{D}}\) be different from any \(a_1,\ldots ,a_n.\) Then for any positive integer n, (i) the reproducing kernel of the zero-based invariant subspace \(I_{A_n}\) is

$$\begin{aligned} K_{A_n}(z,w)=\sqrt{\Vert k_w\Vert ^2-\sum _{k=1}^n |\langle k_w, B_k\rangle |^2}B_{n+1}^w,\ K_{A_0}(z,w)=K(z,w), \end{aligned}$$

where \((B_1,\ldots ,B_n,B_{n+1}^w)\) is the orthonormalization of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_n},k_w);\) (ii) if \(a_{n+1}\) coincides with some \(a_k, k=1,\ldots ,n,\) that is, \(l(a_{n+1})>1,\) then \(B_{n+1}^{a_{n+1}}\) defined through \(\lim _{w\rightarrow a_{n+1}}B_{n+1}^w\) will satisfy

$$\begin{aligned} \lim _{w\rightarrow a_{n+1}}B_{n+1}^w=\frac{\tilde{k}_{a_{n+1}}(z)-\sum _{k=1}^n \langle \tilde{k}_{a_{n+1}}, B_k\rangle B_k (z)}{\sqrt{\Vert \tilde{k}_{a_{n+1}}\Vert ^2-\sum _{k=1}^n|\langle \tilde{k}_{a_{n+1}}, B_k\rangle |^2}}. \end{aligned}$$

That is, \(B^{a_{n+1}}_{n+1}=B_{n+1}.\)

Proof

(i) Let w be none of \(a_1,\ldots ,a_n.\) Under the Gram–Schmidt (G–S) orthonormalization process, \(B_{n+1}^w\) is given by

$$\begin{aligned} B_{n+1}^w(z)=\frac{k_{w}(z)-\sum _{k=1}^n \langle k_{w}, B_k\rangle B_k (z)}{\sqrt{\Vert k_w\Vert ^2-\sum _{k=1}^n|\langle k_w, B_k\rangle |^2}}. \end{aligned}$$
(3.1)

We are to show that

$$\begin{aligned} K_{A_n}(\cdot ,w)= & {} \sqrt{\Vert k_w\Vert ^2-\sum _{k=1}^n|\langle k_w, B_k\rangle |^2}B_{n+1}^w(z)\\= & {} k_{w}-\sum _{k=1}^n \langle k_{w}, B_k\rangle B_k (z) \end{aligned}$$

is the reproducing kernel of \(I_{A_n}.\) To this end we will show (1) \(K_{A_n}(\cdot ,w)\) itself belongs to the space \(I_{A_n};\) and (2) For \(f=\phi _{a_1}\cdots \phi _{a_n}g, g\in \mathbb {A}^2,\) there holds \(\langle f, K_{A_n}(\cdot ,w)\rangle =f(w).\)

Now we show (1). From the construction of \(K_{A_n}(\cdot ,w)\) we know that it is orthogonal to all \(B_1,\ldots, B_n,\) and hence orthogonal to all \(\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_n}.\) Let a be a complex number appearing in the sequence \((a_1,\ldots ,a_n)\) with repetition number l. Then \(k_a, \frac{\partial }{\partial \overline{w}}k_a,\ldots ,\left( \frac{\partial }{\partial \overline{w}}\right) ^{l-1}k_a\) appear among \(\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_n}.\) By invoking (2.2), the function \(K_{A_n}(\cdot ,w)\) being orthogonal to \(k_a, \frac{\partial }{\partial \overline{w}}k_a,\ldots ,\left( \frac{\partial }{\partial \overline{w}}\right) ^{l-1}k_a\) means that \(K_{A_n}(a,w)=\frac{\partial }{\partial z}K_{A_n}(a,w)=\cdots =\left( \frac{\partial }{\partial z}\right) ^{l-1}K_{A_n}(a,w)=0.\) The latter implies that \(K_{A_n}(z,w)\) has the multiplicative factor \((z-a)^l.\) By applying this argument to all a appearing in the sequence \((a_1,\ldots ,a_n)\) we obtain that for each w, \(K_{A_n}(\cdot ,w)\) is in \(I_{A_n}.\)

Next we show (2), that is, for \(f=\phi _{a_1}\cdots \phi _{a_n}g, g\in \mathbb {A}^2,\) there holds \(\langle f, K_{A_n}(\cdot ,w)\rangle =f(w).\) We first claim that

$$\begin{aligned} \langle \phi _{a_1}\cdots \phi _{a_n}g, \tilde{k}_{a_k}\rangle =0,\ \ \ \ \ k=1,2,\ldots ,n. \end{aligned}$$
(3.2)

Let \(a_k\) repeat l times in \((a_1,\ldots ,a_n)\ \) and \(l(a_k)\) times in \((a_1,\ldots ,a_k).\) We always have \(l\ge l(a_k).\) In such notation \((\phi _{a_k})^l\) is a factor of \(\phi _{a_1}\cdots \phi _{a_n},\) and for some \(h\in \mathbb {A}^2,\ \phi ^l_{a_k}h=\phi _{a_1}\cdots \phi _{a_n}g.\) By (2.2), we have

$$\begin{aligned} \langle \phi _{a_1}\cdots \phi _{a_n}g, \tilde{k}_{a_k}\rangle= & {} \langle \phi ^l_{a_k} h, \left( \frac{\partial }{\partial \overline{w}}\right) ^{l(a_k)-1}(k_{w})|_{w=a_k} \rangle \\= & {} (\phi ^l_{a_k}h)^{(l(a_k)-1)}(a_k)\\= & {} 0. \end{aligned}$$

Since each \(B_k\) is a linear combination of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_k}),\) we have, owing to (3.2),

$$\begin{aligned} \langle \phi _{a_1}\cdots \phi _{a_n}g, K_n(\cdot ,w)\rangle= & {} \langle \phi _{a_1}\cdots \phi _{a_n}g, k_{w}-\sum _{k=1}^n \langle k_{w}, B_k\rangle B_k (z)\rangle \\= & {} \langle \phi _{a_1}\cdots \phi _{a_n}g, k_w\rangle +\sum _{k=1}^n c_k \left\langle \phi _{a_1}\cdots \phi _{a_n}g, \tilde{k}_{a_k}\right\rangle \\= & {} \phi _{a_1}(w)\cdots \phi _{a_n}(w)g(w)\\= & {} f(w). \end{aligned}$$

Therefore, \(K_{A_n}(z,w)\) is the reproducing kernel of \(I_{A_n}.\) The proof of (i) is complete.

Now we prove (ii) that treats \(l=l(a_{n+1})>1.\) This means that the \(l-1\) terms \(k_{a_{n+1}}, \frac{\partial }{\partial \overline{w}}k_{a_{n+1}},\ldots , \)\(\left( \frac{\partial }{\partial \overline{w}}\right) ^{(l-2)}k_{a_{n+1}},\) as functions of the z variable, have appeared in the sequence \((\tilde{k}_{a_1}, \ldots ,\tilde{k}_{a_{n}}).\) Therefore, the function

$$\begin{aligned} k_{a_{n+1}}+\frac{\frac{\partial }{\partial \overline{w}}k_{a_{n+1}}}{1!}(\overline{w}-\overline{a}_{n+1})+\cdots +\frac{\left( \frac{\partial }{\partial \overline{w}}\right) ^{l-2}k_{a_{n+1}}}{(l-2)!}(\overline{w}-\overline{a}_{n+1})^{l-2}, \end{aligned}$$

as the order-\((l-2)\) Taylor expansion of the function \(k_w(z)\) at \(\overline{w}=\overline{a}_{n+1}\) is already in the linear span of \(B_1,\ldots , B_n.\) Denoting it by \(T_{l-2}(k_w, a_{n+1}),\) the last mentioned assertion amounts to the relation

$$\begin{aligned} T_{l-2}(k_w, a_{n+1})-\sum _{k=1}^n\langle T_{l-2}(k_w, a_{n+1}),B_k\rangle B_k=0. \end{aligned}$$
(3.3)

For w being different from all \(a_k, k=1,\ldots , n,\) we have, owing to (3.1),

$$\begin{aligned} \frac{k_{w}(z)-\sum _{k=1}^n \langle k_w, B_k\rangle B_k (z)}{\Vert k_{w}-\sum _{k=1}^n \langle k_w, B_k\rangle B_k \Vert }=B_{n+1}^w(z), \end{aligned}$$
(3.4)

Inserting (3.3) into (3.4), and dividing the numerator and the denominator by \((\overline{w}-\overline{a}_{n+1})^{l-1}\) and \(|\overline{w}-\overline{a}_{n+1}|^{l-1},\) respectively, we have

$$\begin{aligned} \frac{\frac{k_{w}(z)-T_{l-2}(k_{w}, a_{n+1})(z)}{(\overline{w}-\overline{a}_{n+1})^{l-1}} - \sum \nolimits _{k=1}^n \left\langle \frac{k_{w}-T_{l-2}(k_{w}, a_{n+1})}{(\overline{w}-\overline{a}_{n+1})^{l-1}}, B_k\right\rangle B_k (z)}{\left\Vert \frac{k_{w}-T_{l-2}(k_{w}, a_{n+1})}{(\overline{w}-\overline{a}_{n+1})^{l-1}}-\sum \nolimits _{k=1}^n \left\langle \frac{k_{w}-T_{l-2}(k_{w}, a_{n+1})}{(\overline{w}-\overline{a}_{n+1})^{l-1}}, B_k\right\rangle B_k \right\Vert }=e^{-i(l-1)\theta }B_{n+1}^w(z),\nonumber \\ \end{aligned}$$
(3.5)

where \(\overline{w}-\overline{a}_{n+1}=|\overline{w}-\overline{a}_{n+1}|e^{i\theta }.\) Letting \(w\rightarrow a_{n+1}\) while keeping the direction \(e^{i\theta }\) fixed, and using the Lagrange type remainder, we obtain

$$\begin{aligned} e^{i(l-1)\theta }\frac{\tilde{k}_{a_{n+1}}(z)-\sum _{k=1}^n \langle \tilde{k}_{a_{n+1}}, B_k\rangle B_k (z)}{\Vert \tilde{k}_{a_{n+1}}-\sum _{k=1}^n \langle \tilde{k}_{a_{n+1}}, B_k\rangle B_k \Vert }=B_{n+1}(z). \end{aligned}$$

This shows that the G–S orthonormalization of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_{n+1}})\) is \((B_1,\ldots ,B_{n+1}).\) The proof of (ii) is complete. \(\square \)

In [6] the author obtains analogous results with the assumption that all \(a_1,\ldots , a_n\) are distinct.
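The construction in Theorem 3.1 can be mirrored numerically; the following Python sketch (illustrative only, not the authors' code; it works with truncated Taylor coefficients) performs the G–S orthonormalization of the generalized kernels for a parameter tuple with a repeated entry and prints the resulting Gram matrix:

import numpy as np
from math import comb, factorial

N = 300
n = np.arange(N)

def bergman_inner(c, d):
    # <z^m, z^n>_{A^2} = delta_{mn}/(n+1)
    return np.sum(c * np.conj(d) / (n + 1.0))

def tilde_kernel(a, l):
    # Taylor coefficients of the generalized kernel (2.1): l! z^{l-1} (1 - conj(a) z)^{-(l+1)}
    d = np.zeros(N, dtype=complex)
    for m in range(l - 1, N):
        d[m] = factorial(l) * comb(m + 1, l) * np.conj(a)**(m - l + 1)
    return d

def gram_schmidt(params):
    # B_1,...,B_n from the generalized kernels of a parameter tuple (repetitions allowed)
    B, counts = [], {}
    for a in params:
        counts[a] = counts.get(a, 0) + 1
        v = tilde_kernel(a, counts[a])
        for Bk in B:
            v = v - bergman_inner(v, Bk) * Bk
        B.append(v / np.sqrt(bergman_inner(v, v).real))
    return B

B = gram_schmidt([0.2, 0.5 + 0.1j, 0.5 + 0.1j])    # the second parameter is repeated
G = np.array([[bergman_inner(u, v) for v in B] for u in B])
print(np.round(np.abs(G), 6))                       # approximately the identity matrix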

Recalling that

$$\begin{aligned} I_\mathbf{a}=\{f\in \mathbb {A}^2: f(\mathbf a)=\mathrm 0\}, \end{aligned}$$

we have

Theorem 3.2

Let \(\mathbf{a}\) be any sequence in \(\mathbf{D}\) and h the infinite error function

$$\begin{aligned} h=f-\sum _{k=1}^\infty \langle f,B_k\rangle B_k . \end{aligned}$$
(3.6)

Then there holds \(h\in I_\mathbf{a}.\) Moreover,

$$\begin{aligned} \mathbb {A}^2=\overline{\mathrm{span}}\{B_k\}_{k=1}^\infty \oplus I_\mathbf{a}. \end{aligned}$$
(3.7)

Proof

First, the Riesz–Fischer Theorem gives

$$\begin{aligned} h=f-\sum _{k=1}^\infty \langle f, B_k\rangle B_k\in \mathbb {A}^2. \end{aligned}$$

Next, the fact \(h\perp B_k\) for all k implies \(h\perp \tilde{k}_{a_k}\) for all k, which further implies that h has \(\mathbf{a}\) as its zeros including the multiplicities. So, the LHS of (3.7) is a subset of the RHS. The inverse inclusion is obvious. The proof is complete. \(\square \)

The last theorem does not exclude the case \(I_\mathbf{a}=\{0\}.\) For such case we have

Corollary 3.3

\(\{B_k\}_{k=1}^\infty \) is a basis if and only if \(I_\mathbf{a}=\{0\}.\)

It is an interesting question for which \(\mathbf{a}\) there holds \(I_\mathbf{a}\ne \{0\},\) and thus \(\{B_k\}_{k=1}^\infty \) is not a basis. This, however, has been an open question till now. Studies show that no condition only on the magnitudes of the \(a_k\)’s can guarantee \(\mathbf{a}\) to be the zero sequence of some non-trivial \(f\in \mathbb {A}^2.\) The following remark is based on the existing literature [23, 30, 32].

Remark 3.4

Let \(\Omega =\prod _{j=1}^\infty [0,2\pi ).\) Let \(\mu _j\) be the normalized Lebesgue measure on the j-th factor of \(\Omega ,\) and let \(\mu \) be the product probability measure on \(\Omega ,\) \(\mu =\prod _{j=1}^\infty \mu _j\) [30, 32]. Consider the map of \(\Omega \) into the set of holomorphic functions \(H(\mathbf{D})\) defined by \(\omega \rightarrow H_\mathbf{a},\) where \(a_j=r_je^{i\omega _j}\) and \(\omega =\{\omega _j\}_{j=1}^\infty .\) Then under the condition

$$\begin{aligned} \lim \sup _{\varepsilon \rightarrow 0+} \frac{\sum _{j=1}^\infty (1-r_j)^{1+\varepsilon }}{\log \frac{1}{\varepsilon }}<1/4, \end{aligned}$$
(3.8)

there holds that \(H_\mathbf{a}\in \mathbb {A}^2\) for \(\mu \)-a.e. \(\omega \) [30].

Based on the above result, as well as Corollary 3.3, we conclude

Theorem 3.5

Under the condition (3.8), \(\{B_j\}_{j=1}^\infty \) is almost surely not a basis.

3.2 Formulation of Pre-orthogonal Adaptive Fourier Decomposition in the Bergman Space \(\mathbb {A}^2\)

The purpose of this section is to introduce a machinery that gives rise to fast rational approximation of functions in the Bergman space. The machinery is an adaptation of a general method, called pre-orthogonal adaptive Fourier decomposition or POAFD, for reproducing kernel Hilbert spaces possessing the boundary vanishing property (BVP) [10, 18]. It is the pre-orthogonal process that gives rise to repeated selection of parameters, when necessary, and therefore to the necessity of successive derivatives of the reproducing kernel.

Let \(f\in \mathbb {A}^2.\) Denote \(f=f_1.\) For any \(a_1\in \mathbf{D}\) and the normalized reproducing kernel \(e_{a_1},\) we have the identity

$$\begin{aligned} f(z)=\langle f_1, e_{a_1}\rangle e_{a_1}(z) + f_2(z), \end{aligned}$$

where the standard remainder \(f_2=f_1-\langle f_1, e_{a_1}\rangle e_{a_1}\) is orthogonal to \(e_{a_1}.\) We thus have

$$\begin{aligned} \Vert f\Vert ^2=|\langle f_1,e_{a_1}\rangle |^2+\Vert f_2\Vert ^2. \end{aligned}$$

The strategy is to maximize the value of \(|\langle f_1,e_{a_1}\rangle |^2\) through the selection of \(a_1,\) and thus to minimize the energy of \(f_2.\) Despite the fact that \(\mathbf{D}\) is an open set, we have

Lemma 3.6

For any \(f\in \mathbb {A}^2\) there holds

$$\begin{aligned} \lim _{|b|\rightarrow 1}|\langle f,e_{b}\rangle |=0. \end{aligned}$$

As a consequence, there exists \(a\in \mathbf{D}\) such that

$$\begin{aligned} |\langle f,e_{a}\rangle |^2=\max \{ |\langle f,e_{b}\rangle |^2\ :\ b\in \mathbf{D}\}. \end{aligned}$$

Proof

The Cauchy-Schwarz inequality gives

$$\begin{aligned} |\langle f,e_b\rangle |\le \Vert f\Vert . \end{aligned}$$

Therefore \( |\langle f,e_b\rangle | \) has a finite upper bound.

For \(\varepsilon >0,\) due to density of polynomials in the Bergman space, there exists a polynomial g such that

$$\begin{aligned} \Vert f-g\Vert \le \varepsilon . \end{aligned}$$

The Cauchy-Schwarz inequality gives

$$\begin{aligned} |\langle f,e_b\rangle |\le |\langle g,e_b\rangle |+\varepsilon =(1-|b|^2)|g(b)|+\varepsilon . \end{aligned}$$

The last quantity tends to zero as \(|b|\rightarrow 1.\) A Bolzano–Weierstrass type compactness argument then shows that \(|\langle f,e_b\rangle |\) attains its maximum value at an interior point. \(\square \)

For \(f_1\in \mathbb {A}^2\) choose \(a_1\) such that

$$\begin{aligned} |\langle f_1,e_{a_1}\rangle |^2=\max \left\{ |\langle f_1,e_{b}\rangle |^2 : b\in \mathbf{D}\right\} . \end{aligned}$$

Fixing such \(a_1\) and letting \(f_{n+1}(z)=f_n-\langle f_n, B_n^b\rangle B_n^b(z)\), we have the identity

$$\begin{aligned} f(z)= & {} \langle f_1, B_1\rangle B_1(z)+\cdots +\langle f_{n-1}, B_{n-1}\rangle B_{n-1}(z)\nonumber \\&\quad + \left\langle f_n, B_n^b\right\rangle B_n^b(z) + f_{n+1}, \end{aligned}$$
(3.9)

where \(B_1=e_{a_1},\) and \(a_1,\ldots , a_{n-1}\) are previously selected, \(\{B_1,\ldots, B_{n-1}\}\) is the G–S orthonormalization of \(\{\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_{n-1}}\},\) and \(\{B_1,\ldots ,B_{n-1},B_n^b\}\) is the G–S orthonormalization of \(\{B_1,\ldots ,B_{n-1},\tilde{k}_{b}\}.\) For the time being we assume that b is not among \(a_1,\ldots ,a_{n-1},\) that is, \(l(b)=1\) and \(\tilde{k}_{b}={k}_{b}.\) Expressing \(B_n^b\) by (3.1), due to the orthogonality between \(f_n\) and \(B_{k}, k=1,\ldots , n-1,\) we have

$$\begin{aligned} \langle f_n,B_n^b\rangle =\left\langle f_n, \frac{e_b}{\sqrt{1-\sum _{k=1}^{n-1}|\langle e_b,B_k\rangle |^2}} \right\rangle . \end{aligned}$$

Applying the Cauchy-Schwarz inequality to \(|\langle f_n,B_n^b\rangle |,\) we see first that the modulus of the above inner product has a finite upper bound independent of \(b\in \mathbf{D}.\) We will show that this modulus, as a function of b, reaches its global maximum at an interior point \(b=a_n\in \mathbf{D}.\) To this end it suffices to show that the modulus tends to zero as \(|b|\rightarrow 1.\) The quantity can be decomposed into two parts, as

$$\begin{aligned} \left\langle f_n, \frac{e_b}{\sqrt{1-\sum _{k=1}^{n-1}|\langle e_b,B_k\rangle |^2}} \right\rangle= & {} \left\langle f_n, \frac{e_b}{\sqrt{1-\sum _{k=1}^{n-1}|\langle e_b,B_k\rangle |^2}}- e_b \right\rangle + \langle f_n, e_b\rangle \\= & {} I_1(b)+I_2(b). \end{aligned}$$

We have

$$\begin{aligned} |I_1(b)|\!\le \!\int _\mathbf{D} |f_n||e_b|\left( \frac{1}{\sqrt{1-\sum _{k=1}^{n-1}|\langle e_b,B_k\rangle |^2}}\!-\!1\right) dA. \end{aligned}$$
(3.10)

Applying Lemma 3.6 to each term \(|\langle e_b,B_k\rangle |\) of the summation, we obtain that, if |b| is close to 1, the difference in the round brackets is dominated by any previously set \(\varepsilon >0.\) Taking this \(\varepsilon \)-domination out of (3.10) and applying the Cauchy-Schwarz inequality to the remaining integral of \(|f_n||e_b|,\) we conclude

$$\begin{aligned} \lim _{|b|\rightarrow 1-}I_1(b)=0. \end{aligned}$$

Lemma 3.6 implies \(\lim _{|b|\rightarrow 1-}I_2(b)=0 .\) To summarize, we have

$$\begin{aligned} \lim _{|b|\rightarrow 1-} |\langle f_n,B_n^b\rangle |=0. \end{aligned}$$
(3.11)

The supremum of \(|\langle f_n,B_n^b\rangle |\) can always be approached through a sequence \(\{b_k\}_{k=1}^\infty \) in \(\mathbf{D},\) where for each \(k, b_k\ne a_l, l=1,\ldots ,n,\) and \(\lim _{k\rightarrow \infty }b_k=a_n.\) If \(a_n\ne a_1,\ldots ,a_{n-1},\) i.e., \(l(a_n)=1,\) then one simply has \((B_1,\ldots ,B_n)=(B_1,\ldots ,B^{a_n}_n),\) the latter being the G–S orthonormalization of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_{n-1}}, k_{a_n}).\) If \(l(a_n)>1,\) then according to Theorem 3.1, one also has \((B_1,\ldots ,B_n)=(B_1,\ldots ,B^{a_n}_n),\) but the latter is the G–S orthonormalization of \((\tilde{k}_{a_1},\ldots , \tilde{k}_{a_n}).\) Fixing such a maximal selection \(a_n\) we have that \(f_{n+1}\) defined by (3.9) is in the orthogonal complement of the span of \(B_1,\ldots ,B_n,\) and \( \langle f_1,B_1\rangle B_1+\cdots +\langle f_n,B_n\rangle B_n\) is the orthogonal projection of f onto the span of \((B_1,\ldots ,B_n).\) We therefore have

$$\begin{aligned} \Vert f\Vert ^2=\sum \limits _{k=1}^n|\langle f,B_k\rangle |^2+\Vert f_{n+1}\Vert ^2. \end{aligned}$$

We note that \(f_{n+1}\in I_{\{a_1,\ldots ,a_n\}}.\)

The above argument is summarized as the maximal selection principle stated as

Theorem 3.7

For any \(f\in \mathbb {A}^2\) and any positive integer n there exists \(a_n\in \mathbf{D}\) such that

$$\begin{aligned} \left| \left\langle f,B^{a_n}_{n}\right\rangle \right| ^2=\sup \left\{ \left| \left\langle f,B_{n}^b\right\rangle \right| ^2 : b\in \mathbf{D}\right\} , \end{aligned}$$
(3.12)

where for any \(b\in \mathbf{D},\) \((B_1,\ldots ,B_{n-1},B^b_n)\) is the G–S orthonormalization of \((\tilde{k}_{a_1},\ldots ,\tilde{k}_{a_{n-1}},\tilde{k}_b)\) corresponding to the n-tuple \((a_1,\ldots ,a_{n-1},b).\)
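In practice the supremum in (3.12) is approached over a discretization of \(\mathbf{D}.\) The following Python sketch is a hedged illustration of a few maximal selection steps (the grid, the truncation order and the target function are our own choices, and near-coincidence with previously selected parameters is simply skipped rather than handled through the generalized kernels); it prints the decreasing remainder energies.

import numpy as np

N = 300
n = np.arange(N)

def inner(c, d):
    # Bergman inner product from Taylor coefficients: <z^m, z^n> = delta_{mn}/(n+1)
    return np.sum(c * np.conj(d) / (n + 1.0))

def kernel(b):
    # Taylor coefficients of k_b(z) = (1 - conj(b) z)^{-2}
    return (n + 1.0) * np.conj(b)**n

def maximal_selection(f, B, grid):
    # one POAFD step: a grid surrogate of the supremum in (3.12)
    best_b, best_val, best_B = None, -1.0, None
    for b in grid:
        v = kernel(b)
        for Bk in B:                       # pre-orthogonalize k_b against B_1,...,B_{n-1}
            v = v - inner(v, Bk) * Bk
        nrm = np.sqrt(inner(v, v).real)
        if nrm < 1e-12:                    # b numerically coincides with a previous parameter
            continue
        val = abs(inner(f, v / nrm))
        if val > best_val:
            best_b, best_val, best_B = b, val, v / nrm
    return best_b, best_B

f = (0.8**n + 0.5 * (-0.6)**n).astype(complex)   # Taylor coefficients of 1/(1-0.8z) + 0.5/(1+0.6z)
grid = [r * np.exp(1j * t) for r in np.linspace(0.05, 0.95, 19)
        for t in np.linspace(0, 2 * np.pi, 40, endpoint=False)]

B, remainder = [], f.copy()
for step in range(4):
    a_n, B_n = maximal_selection(remainder, B, grid)
    B.append(B_n)
    remainder = remainder - inner(remainder, B_n) * B_n
    print(step + 1, np.round(a_n, 3), np.sqrt(inner(remainder, remainder).real))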

We next prove

Theorem 3.8

Let f be any function in \(\mathbb {A}^2.\) Under the maximal selections of the parameters \((a_1,\ldots ,a_n,\cdots )\) there holds

$$\begin{aligned} f=\sum _{k=1}^\infty \langle f,B_k\rangle B_k. \end{aligned}$$

Proof

We prove the desired convergence by contradiction. Assume that through a sequence of maximally selected parameters \(\mathbf{a}=(a_1,\ldots ,a_n,\cdots )\) we have

$$\begin{aligned} f=\sum _{k=1}^\infty \langle f,B_k\rangle B_k + h, \qquad h\ne 0. \end{aligned}$$
(3.13)

By the Bessel inequality and the Riesz–Fischer Theorem, \(\sum _{k=1}^\infty \langle f,B_k\rangle B_k \in \mathbb {A}^2,\) and hence \(h\in \mathbb {A}^2.\) We separate the series into two parts

$$\begin{aligned} f=\sum _{k=1}^M \langle f,B_k\rangle B_k + \sum _{k=M+1}^\infty \langle f,B_k\rangle B_k + h, \end{aligned}$$

where by our notation,

$$\begin{aligned} f_{M+1}=\sum _{k=M+1}^\infty \langle f,B_k\rangle B_k + h =g_{M+1}+h. \end{aligned}$$

In the last equality, due to the orthogonality, we may replace f with \(f_k\) and have

$$\begin{aligned} f_{M+1}=\sum _{k=M+1}^\infty \langle f_k,B_k\rangle B_k + h. \end{aligned}$$

Since the linear span of the reproducing kernels of \(\mathbb {A}^2\) is dense and \(h\ne 0,\) there exists \(a\in \mathbf{D}\) such that \(\delta \triangleq |\langle h,e_a\rangle |>0.\) We can, in particular, choose a to be different from all the \(a_k\)’s in the sequence \(\mathbf{a}.\) The contradiction that we are going to derive will be in relation to the maximal selection of \(a_{M+1}\) when M is large. Now, on one hand, the Bessel inequality implies

$$\begin{aligned} \lim _{k\rightarrow \infty }|\langle f_{k},B_{k}\rangle |= 0, \end{aligned}$$

and thus the maximal selection of \(a_{M+1}\) implies

$$\begin{aligned} \left| \left\langle f_{M+1},B_{M+1}^{a}\right\rangle \right| \le |\langle f_{M+1},B_{M+1}\rangle |\rightarrow 0, \quad \mathrm{as}\quad M\rightarrow \infty . \end{aligned}$$
(3.14)

On the other hand, as we will show, for large M,

$$\begin{aligned} \left| \left\langle f_{M+1},B_{M+1}^{a}\right\rangle \right| > \frac{\delta }{2}. \end{aligned}$$
(3.15)

This is then clearly a contradiction.

The rest part of the proof is devoted to showing (3.15). Due to the quantity relations

$$\begin{aligned} \left| \left\langle f_{M+1}, B_{M+1}^a \right\rangle \right| \ge \left| \left\langle h, B_{M+1}^a\right\rangle \right| -\left| \left\langle g_{M+1}, B_{M+1}^a \right\rangle \right| \end{aligned}$$
(3.16)

and

$$\begin{aligned} \left| \left\langle g_{M+1}, B_{M+1}^a \right\rangle \right| \le \Vert g_{M+1}\Vert \rightarrow 0, \qquad \mathrm{as}\quad M\rightarrow \infty , \end{aligned}$$
(3.17)

we see that if M grows large, the lower bounds of \(|\langle f_{M+1}, B_{M+1}^a\rangle |\) depend on those of \(|\langle h, B_{M+1}^a\rangle |\). To analyze the lower bounds of \(|\langle h, B_{M+1}^a\rangle |,\) for any positive integer M,  denote by \(X_{M+1}^{a}\) the \((M+1)\)-dimensional space spanned by \(\{{e}_{a}, {\tilde{k}}_{a_1},\ldots ,{\tilde{k}}_{a_M}\}.\) We have two methods to compute the energy of the projection of h into \(X_{M+1}^{a},\) denoted by \(\Vert h/X_{M+1}^a\Vert ^2.\) One method is based on the orthonormalization \((B_1,\ldots ,B_M,B_{M+1}^a).\) In such way, due to the orthogonality of h with \(B_1,\ldots ,B_M,\) we have

$$\begin{aligned} \left\| h/X_{M+1}^a\right\| ^2=\left| \left\langle h,B_{M+1}^a \right\rangle \right| ^2. \end{aligned}$$

The second way is based on the orthonormalization in the order \((e_a,{\tilde{k}}_{a_1},\ldots ,{\tilde{k}}_{a_M}).\) Then we have

$$\begin{aligned} \left\| h/X_{M+1}^a \right\| ^2\ge |\langle h,e_a\rangle |^2=\delta ^2. \end{aligned}$$

Hence we have, for every M, \(|\langle h,B_{M+1}^a\rangle |\ge \delta .\) In view of this last estimate and (3.17), (3.16), we arrive at the contradiction spelled out by (3.15) and (3.14). The proof is thus complete. \(\square \)

We note that in [6] the author studies expansions of functions in the orthogonal complement of a zero-based invariant subspace where the distinguished zeros are assumed to form an interpolation sequence. In our setting the zero sequence, allowing multiplicities, is not pre-assumed to form an interpolating sequence, and it gives rise to a fast-converging representation of the function to be expanded.

3.3 Convergence Rate of Pre-orthogonal Adaptive Fourier Decomposition

The POAFD approach does not fall into any existing category of greedy algorithms. The differences between the methods in the AFD category, including POAFD, and the greedy algorithms include (i) in the AFD methods the parameters can be repeatedly selected; and (ii) at each of the iterative steps the maximal projection can be attained. These methods, therefore, in general offer better approximations than the so-called greedy algorithms [10]. Below we prove the corresponding convergence rate.

Before we prove the convergence rate estimation we recall the following lemma whose proof can be found in the greedy algorithm literature [10].

Lemma 3.9

Assume that a sequence of positive numbers \(\{d_k\}_{k=1}^\infty \) satisfies the conditions

$$\begin{aligned} d_1\le A, \quad d_{k+1}\le d_k\left( 1-\frac{d_k}{r_k^2A}\right) , \qquad k=1,2,\ldots \end{aligned}$$

Then there holds

$$\begin{aligned} d_k\le \frac{A}{1+\sum _{l=1}^k \frac{1}{r_l^2}}, \qquad k=1,2,\ldots \end{aligned}$$
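For the reader's convenience we indicate the standard argument behind the lemma (a sketch only): since \(1/(1-x)\ge 1+x\) for \(0\le x<1,\) the recursion gives

$$\begin{aligned} \frac{1}{d_{k+1}}\ge \frac{1}{d_k}\cdot \frac{1}{1-\frac{d_k}{r_k^2A}}\ge \frac{1}{d_k}+\frac{1}{r_k^2A}, \end{aligned}$$

and summing these inequalities over k, together with \(d_1\le A,\) yields a bound of the stated form.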

For \(M>0\) we will be working with the subclass \({\mathbb {A}^2}_M\) of \(\mathbb {A}^2\) defined as

$$\begin{aligned} {\mathbb {A}^2}_{M} = \left\{ f\in \mathbb {A}^2: \mathrm{there\ exists}\ \{b_1,\ldots ,b_n,\ldots \},\ f = \sum _{l=1}^\infty c_l{e}_{b_l},\ \sum _{l=1}^\infty |c_l|\le M\right\} . \end{aligned}$$

Theorem 3.10

Let \(f=\sum _{l=1}^\infty c_l{e}_{b_l}\in {\mathbb {A}^2}_M,\) and let \(f_n\) be the orthogonal standard remainder corresponding to the maximal selections of the \(a_k\)’s. Then there holds the estimate

$$\begin{aligned} \Vert f_k\Vert \le M\left( 1+\sum _{l=1}^k\left( \frac{1}{r_l}\right) ^2\right) ^{-\frac{1}{2}}, \end{aligned}$$

where \(r_k=\sup \{ r_k(b_l) : l=1,2,\ldots \},\) and \(r_k(b_l)=\sqrt{1-\sum _{t=1}^{k-1}|\langle {e}_{b_l},B_t\rangle |^2}=\Vert e_{b_l}-\sum _{t=1}^{k-1}\langle {e}_{b_l},B_t\rangle B_t\Vert .\)

Proof

We start from the inequality chain

$$\begin{aligned} |\langle f_k,B_k\rangle |\ge & {} \sup \left\{ \left| \left\langle f_k,B_k^a\right\rangle \right| : a\in \mathbf{D}\right\} \\\ge & {} \sup \left\{ \left| \left\langle f_k,B_k^{b_l}\right\rangle \right| : l=1,2,\ldots \right\} \\= & {} \sup \left\{ \frac{|\langle f_k,e_{b_l}\rangle |}{r_k(b_l)}: l=1,2,\ldots \right\} \\\ge & {} \frac{1}{r_k} \sup \left\{ {|\langle f_k,e_{b_l}\rangle |} : l=1,2,\ldots \right\} \\\ge & {} \frac{1}{r_k M} \left| \left\langle f_k,\sum _{l=1}^\infty c_l{e}_{b_l}\right\rangle \right| \\= & {} \frac{1}{r_k M} |\langle f_k,f\rangle |\\= & {} \frac{1}{r_k M} |\langle f_k,f_k\rangle | \qquad (\mathrm{since}\ f_k\perp (f-f_k))\\= & {} \frac{\Vert f_k\Vert ^2}{r_k M}. \end{aligned}$$

Substituting this inequality into the relation

$$\begin{aligned} \Vert f_{k+1}\Vert ^2=\Vert f_k\Vert ^2-|\langle f_k,B_k\rangle |^2, \end{aligned}$$

we have

$$\begin{aligned} \Vert f_{k+1}\Vert ^2\le \Vert f_k\Vert ^2\left( 1-\frac{1}{(r_kM)^2}\Vert f_k\Vert ^2\right) . \end{aligned}$$

By invoking Lemma 3.9, we have the desired estimation

$$\begin{aligned} \Vert f_k\Vert \le \frac{M}{\sqrt{1+\sum _{l=1}^k\frac{1}{r_l^2}}}. \end{aligned}$$

The proof is complete. \(\square \)

Remark 3.11

Since \(0<r_k\le 1,\) we have, at least

$$\begin{aligned} \Vert f_k\Vert \le \frac{M}{\sqrt{k}}. \end{aligned}$$

Coincidentally this is the same order of bound as for the Shannon expansion. The Shannon expansion, however, treats band-limited entire functions, whose great smoothness is what is usually required for good convergence rates. The Bergman space, on the other hand, contains functions that blow up at the boundary.

Remark 3.12

With regard to the direct sum decomposition (3.7), what we can say in relation to Theorem 3.8 is that \(f\in \overline{\mathrm{span}}\{B_k\}_{k=1}^\infty \) with fast convergence, while the maximally selected sequence \(\mathbf{a}\) may or may not give rise to a basis.

4 The Bergman Space on the Upper Half Complex Plane

The theory on the upper half complex plane \(\mathbf{C}_+\) is closely analogous to the one for the unit disc. Denote by \(\mathbb {A}^2({\mathbf{C}_+})\) the square integrable Bergman space of the upper half complex plane \(\mathbf{C}_+.\) That is,

$$\begin{aligned} {\mathbb {A}}^2({\mathbf{C}_+})= & {} \Bigg \{f: {\mathbf{C}_+}\rightarrow \mathbf{C}\ | \ f\ \mathrm{is \ holomorphic \ in \ {\mathbf{C}_+}, \ and\ }\nonumber \\ \Vert f\Vert _{\mathbb {A}^2({\mathbf{C}_+})}^2= & {} \int _{\mathbf{C}_+} |f(z)|^2dA<\infty \Bigg \}. \end{aligned}$$

The reproducing kernel \(k^+_a\) of \(\mathbb {A}^2({\mathbf{C}_+})\) at the point \(a\in \mathbf{C}_+\) is given by \( k^+_a(z)=\frac{-1}{(z-\overline{a})^2}\) [33, 34]. Moreover, \( \Vert k^+_a\Vert ^2= k^+_a(a)=\frac{1}{(2\mathrm{Im}\{a\})^{2}}\) and the normalized kernel is \(e^+_{a}(z)=\frac{k^+_a(z)}{\Vert k^+_a\Vert }=\frac{-2\mathrm{Im}\{a\}}{(z-\overline{a})^2},\) where \(\mathrm{Im}\{a\}\) denotes the imaginary part of a. In the later part of this section we will write, for simplicity but with abuse of notation, \(k^+_a, e^+_a\) as \(k_a, e_a,\) etc.
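Since inner products with reproducing kernels are available in closed form, the kernel identities above, and the boundary vanishing behavior proved in Lemma 4.2 below, can be illustrated without numerical integration when f is a finite combination of kernels. A minimal Python sketch (the coefficients and nodes are arbitrary illustrative choices):

import numpy as np

def k_plus(a, z):
    # Bergman kernel of C_+ with the normalized area measure: k_a(z) = -1/(z - conj(a))^2
    return -1.0 / (z - np.conj(a))**2

def f_value(coef, nodes, z):
    # f = sum_j coef_j * k_{nodes_j}, a finite combination of kernels
    return sum(c * k_plus(w, z) for c, w in zip(coef, nodes))

coef, nodes = [1.0, -0.5], [1j, 2.0 + 0.5j]          # illustrative choice of f
for b in (1j, 0.5 + 0.001j, 1000.0 + 1j):            # interior point, near R, near infinity
    # <f, k_b> = f(b) by the reproducing property, and ||k_b|| = 1/(2 Im b),
    # so |<f, e_b>| = 2 Im(b) |f(b)|, which is small near the boundary
    print(b, 2 * b.imag * abs(f_value(coef, nodes, b)))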

Lemma 4.1

\(\mathrm{span }\{\frac{1}{(z-\overline{a})^2}\ |\ a\in {\mathbf{C}_+}\}\) is dense in \(\mathbb {A}^2({\mathbf{C}_+}).\)

Proof

Let \({\mathcal {A}}=\overline{\mathrm{span }}\{\frac{1}{(z-\overline{a})^2}\ |\ a\in \mathbf{C}_+\},\) we claim that \({\mathcal {A}}=\mathbb {A}^2({\mathbf{C}_+}).\) If this does not hold, then

$$\begin{aligned} \mathbb {A}^2({\mathbf{C}_+})={\mathcal {A}}\oplus {\mathcal {A}}^{\perp } \ \ \mathrm {and}\ \ {\mathcal {A}}^{\perp }\ne \{0\}. \end{aligned}$$

Thus, there exists \(f\in {\mathcal {A}}^{\perp }\) and \(f\ne 0.\) In such case due to the reproducing kernel property we have \(\langle f,\frac{1}{(z-\overline{a})^2}\rangle =f(a)=0\) for every \(a\in \mathbf{C}_+.\) It follows that \(f= 0,\) a contradiction. \(\square \)

Lemma 4.2

For any \(f\in \mathbb {A}^2({\mathbf{C}_+})\) there holds

$$\begin{aligned} \lim _{b\rightarrow \partial \mathbf{C}_+}|\langle f,e_{b}\rangle |=0, \end{aligned}$$

and there exists \(a\in \mathbf{{C}_+}\) such that

$$\begin{aligned} |\langle f,e_{a}\rangle |^2=\max \{ |\langle f,e_{b}\rangle |^2\ :\ b\in \mathbf{{C}_+}\}. \end{aligned}$$

Proof

The Cauchy-Schwarz inequality gives

$$\begin{aligned} |\langle f,e_b\rangle |\le \Vert f\Vert . \end{aligned}$$

Therefore \( |\langle f,e_b\rangle | \) has a finite upper bound. There exists a sequence \(\{b_k\}_{k=1}^\infty \) such that

$$\begin{aligned} \lim _{k\rightarrow \infty }|\langle f, e_{b_k}\rangle |=\sup \{|\langle f, e_b\rangle |\ |\ b\in \mathbf{{C}_+}\}. \end{aligned}$$
(4.1)

We will prove there exists a subsequence \(\{b_{k_l}\}_{l=1}^\infty \) converging to \({\tilde{b}}\in \mathbf{{C}_+}.\) Then we have \(|\langle f, e_{{\tilde{b}}}\rangle |=\lim _{l\rightarrow \infty }|\langle f, e_{{b_{k_l}}}\rangle |=\sup \{|\langle f, e_b\rangle |\ |\ b\in \mathbf{{C}_+}\},\) as desired. It suffices to show that for any \(\varepsilon >0\) there exists an open neighborhood \(B(\delta , R)\) of the boundary of \(\mathbf{C}^+,\) where \(B(\delta , R)=\{b\in \mathbf{C}^+\ |\ \mathrm{Im}\{b\} <\delta \}\cup \{b\in \mathbf{C}^+\ |\ |b|>R\},\) such that \(|\langle f, e_b\rangle |<\varepsilon \) whenever \(b\in B(\delta , R).\) As a consequence of this, as well as of the relation (4.1), when k grows large those \(b_k\) must stay in a compact subset of \(\mathbf{C}^+,\) and thus one can choose a subsequence of \(\{b_{k_l}\},\) according to the Bolzano–Weierstrass Theorem, converging to a point \({\tilde{b}}\) in \(\mathbf{C}^+.\)

For \(\varepsilon >0,\) since \(\mathrm{span }\{\frac{1}{(z-\overline{a})^2}\ |\ a\in \mathbf{C}_+\}\) is dense in \(\mathbb {A}^2({\mathbf{C}_+})\), there exists a function g of the form

$$\begin{aligned} g=\sum _{k=1}^{m}\frac{c_k}{(z-\overline{a}_k)^2}, \quad a_k\in {\mathbf{C}_+},\quad k=1,\ldots ,m, \end{aligned}$$

such that

$$\begin{aligned} \Vert f-g\Vert \le \varepsilon /2. \end{aligned}$$

As for the unit disc case, the triangle inequality and the Cauchy-Schwarz inequality give

$$\begin{aligned} |\langle f,e_b\rangle |\le |\langle g,e_b\rangle |+\varepsilon /2. \end{aligned}$$

On one hand, there exists \(\delta >0\) such that \(\mathrm{Im}\{b\} <\delta \) implies \(|\langle g,e_b\rangle |<\varepsilon /2.\) To show this, denote min\(\{\mathrm{Im}\{a_k\}\}_{k=1}^{m}=\delta _1>0,\) we have

$$\begin{aligned} |\langle g,e_b\rangle |= & {} 2\mathrm{Im}\{b\}\ |\sum _{k=1}^{m}\frac{c_k}{(b-\overline{a}_k)^{2}}|\\\le & {} 2\mathrm{Im}\{b\}\ \sum _{k=1}^{m}\frac{|c_k|}{|b-\overline{a}_k|^{2}}\\\le & {} 2\delta \ \sum _{k=1}^{m}\frac{|c_k|}{(\frac{\delta _1}{2})^{2}}\\< & {} \varepsilon /2, \end{aligned}$$

if \(\delta \) is small enough.

On the other hand, let \(\max \{|a_k|\}_{k=1}^{m}=R_1>0,\) we will show there exists R such that \(|b|>R\) implies \(|\langle g,e_b\rangle |<\varepsilon /2.\)

Let \(|b|>4R_1.\) There holds \(|b-\overline{a}_k|\ge |b|/2.\) Hence

$$\begin{aligned} |\langle g,e_b\rangle |\!\le \!2\mathrm{Im}\{b\}\ \sum _{k=1}^{m}\frac{|c_k|}{|b-\overline{a}_k|^{2}}\!\le \! 2|b|\ \sum _{k=1}^{m}\frac{4|c_k|}{|b|^2}\!=\!\frac{8}{|b|}\sum _{k=1}^{m}|{c_k}|. \end{aligned}$$

So, if \(|b|>R,\) and R is large enough and, in particular, larger than \(4R_1,\) then the last quantity is dominated by \(\varepsilon /2.\)\(\square \)

Note that \(\partial \mathbf{C}^+\) is defined to be \(\mathbf{R}\cup \{\infty \},\) and open neighborhoods of \(\partial \mathbf{C}^+\) are \(B(\delta ,R)=\{z\in \mathbf{C}^+\ |\ \mathrm{Im}\, z<\delta ,\ \mathrm{or}\ |z|>R\}, \delta>0, R>0.\) By b being close to \(\partial \mathbf{C}^+\) we mean that b lies in some \(B(\delta ',R')\) with \(\delta '<\delta ,\ R'>R.\) What is proved above is regarded as the boundary vanishing property of \(\langle f, e_b\rangle ,\) and is denoted

$$\begin{aligned} \lim _{b\rightarrow \partial \mathbf{C}_+}|\langle f,e_b\rangle |=0. \end{aligned}$$

As in the unit disc Bergman space case we have the maximal selection principle on \(\mathbf{C}_+\):

Theorem 4.3

For any \(f\in \mathbb {A}^2({\mathbf{C}_+})\) and positive integer k there exists \(a_k\in \mathbf{{C}_+}\) such that

$$\begin{aligned} |\langle f,B^{a_k}_{k}\rangle |^2=\sup \{ |\langle f,B_{k}^b\rangle |^2\ :\ b\in \mathbf{{C}_+}\}, \end{aligned}$$

where for any \(b\in \mathbf{C}_+,\)\((B_1,\ldots ,B_{k-1},B^b_k)\) is the G–S orthogonalization of \((B_1,\ldots ,B_{k-1},{\tilde{k}}_b)\) corresponding to the k-tuple \((a_1,\ldots ,a_{k-1},b),\) where \({\tilde{k}}_b\) is defined similarly with (2.1).

Then we can develop the POAFD algorithm along with the adaptive rational orthogonal system of the Bergman space in the upper half complex plane. With a proof similar to that of Theorem 3.8 we have

Theorem 4.4

Let f be any function in \(\mathbb {A}^2({\mathbf{C}_+}).\) Under maximal selections of the parameters \((a_1,\ldots ,a_n,\ldots )\) there holds

$$\begin{aligned} f=\sum _{k=1}^\infty \langle f,B_k\rangle B_k. \end{aligned}$$

5 Weighted Bergman Spaces \(\mathbb {A}^2_{\alpha }\) with \(-1<\alpha <\infty \)

In this section we study general weighted Bergman spaces \(\mathbb {A}^2_{\alpha }\ ,\ -1<\alpha <\infty .\) We adopt the notation

$$\begin{aligned} {\mathbb {A}}_\alpha ^2(\mathbf{D})= & {} \Bigg \{f: {\mathbf{D}}\rightarrow \mathbf{C}\ | \ f\ \mathrm{is \ holomorphic \ in \ \mathbf{D}, \ \mathrm and\ }\nonumber \\ \Vert f\Vert _{\mathbb {A}^2_{\alpha }(\mathbf{D})}^2= & {} \left. \int _\mathbf{D} |f(z)|^2dA_\alpha <\infty \right\} , \end{aligned}$$

where \(dA_\alpha (z)=(1+\alpha )(1-|z|^2)^{\alpha }dA(z).\) With the norm \(\Vert \cdot \Vert _{\mathbb {A}^2_{\alpha }(\mathbf{D})}^2\) the space \(\mathbb {A}^2_{\alpha }\) is a reproducing kernel Hilbert space [22]. The reproducing kernel and its norm are given, respectively, by

$$\begin{aligned} k_a^{\alpha }(z)=\frac{1}{(1-\overline{a}z)^{2+\alpha }}\ \ \hbox {and} \ \ \Vert k_a^{\alpha }\Vert ^{2}= k_a^{\alpha }(a)=\frac{1}{({1}-| a|^{2})^{{2}+\alpha }}. \end{aligned}$$
(5.1)

In this section we again denote by \((B_1,\ldots ,B_k)\) the G–S orthonormalization of \(({\tilde{k}}^\alpha _{a_1},\ldots ,{\tilde{k}}^\alpha _{a_k}),\) where \({\tilde{k}}^{\alpha }_{a_j}\) is defined as in (2.1) depending on the multiplicity of \(a_j\) in \((a_1,\ldots ,a_j),\) denoted \(l(a_j),\ 1\le j\le k.\)

If \(f(z)=\sum _{k=0}^{\infty }a_k z^k\in \mathbb {A}^2_{\alpha }, \) then a simple computation gives

$$\begin{aligned} \Vert f\Vert _{\mathbb {A}^2_{\alpha }}^2=\sum _{k=0}^{\infty }\frac{k!\Gamma (\alpha +2)}{\Gamma (k+\alpha +2)}|a_k|^2=\sum _{k=0}^{\infty }\frac{k!}{(\alpha +1+k)(\alpha +k)\cdots (\alpha +2)}|a_k|^2 \end{aligned}$$
(5.2)

(Also see [22].) When \(\alpha \) is an integer, it is identical with \(\sum _{k=0}^{\infty }\frac{\Gamma (\alpha +2)}{(k+\alpha +1)\cdots (k+1)}|a_k|^2\) and, when \(\alpha =0,\) it reduces to \(\sum _{k=0}^{\infty }\frac{1}{k+1}|a_k|^2,\) corresponding to the classical Bergman space. As in the proof of Lemma 3.6, the property \(\Vert k^\alpha _a\Vert \rightarrow \infty \) as \(a\rightarrow \partial \mathbf{D}\) implies the BVP of \(\mathbb {A}^2_{\alpha }.\) With a proof similar to the classical case we can prove the maximal selection principle for \(\mathbb {A}^2_{\alpha }\):

Theorem 5.1

For any \(f\in \mathbb {A}^2_{\alpha }\) and positive integer k there exists \(a_k\in \mathbf{D}\) such that

$$\begin{aligned} |\langle f,B^{a_k}_{k}\rangle |^2=\sup \{ |\langle f,B_{k}^b\rangle |^2\ :\ b\in \mathbf{D}\}, \end{aligned}$$

where for any \(b\in \mathbf{D},\)\((B_1,\ldots ,B_{k-1},B^b_k)\) is the G–S orthonormalization of \((B_1,\ldots ,B_{k-1},{\tilde{k}}^{\alpha }_b)\) corresponding to the k-tuple \((a_1,\ldots ,a_{k-1},b).\)

The proof of Theorem 3.8 can be adapted to the weighted Bergman space cases. There thus holds

Theorem 5.2

Let f be any function in \(\mathbb {A}^2_{\alpha }.\) Under the maximal selections of the parameters \((a_1,\ldots ,a_n,\ldots )\) there holds

$$\begin{aligned} f=\sum _{k=1}^\infty \langle f,B_k\rangle B_k. \end{aligned}$$
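As a sanity check of the coefficient formula (5.2), the monomial norms \(\Vert z^k\Vert ^2_{\mathbb {A}^2_\alpha }\) can be compared with a crude radial quadrature. The Python sketch below is illustrative only; \(\alpha \ge 0\) is chosen so that the integrand stays bounded for the simple midpoint rule.

import numpy as np
from math import gamma, factorial

def monomial_norm_sq_numeric(k, alpha, nr=200000):
    # ||z^k||^2 in A^2_alpha via (1+alpha) * int_0^1 u^k (1-u)^alpha du (u = r^2, angle averaged out)
    du = 1.0 / nr
    u = (np.arange(nr) + 0.5) * du             # midpoint rule on (0, 1)
    return (1 + alpha) * np.sum(u**k * (1 - u)**alpha) * du

def monomial_norm_sq_exact(k, alpha):
    # the coefficient weight appearing in (5.2): k! Gamma(alpha+2) / Gamma(k+alpha+2)
    return factorial(k) * gamma(alpha + 2) / gamma(k + alpha + 2)

for alpha in (0.0, 1.0, 2.5):
    for k in (0, 1, 5):
        print(alpha, k, monomial_norm_sq_numeric(k, alpha), monomial_norm_sq_exact(k, alpha))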

6 Weighted Hardy Spaces \(\mathbb {H}^2_{\beta }\) with \(\beta >0\)

In the last section, we have discussed the weighted Bergman spaces, which correspond to the weighted Hardy space \(\mathbb {H}^2_{\beta }\) with \(\beta <0.\) Now we consider the weighted Hardy spaces \(\mathbb {H}^2_{\beta }\) with \(\beta >0.\) Let \(f=\sum \nolimits _{k=0}^{\infty }c_k z^k.\) Recall that the weighted Hardy spaces \(\mathbb {H}^2_{\beta },\ \beta >0,\) are defined as

$$\begin{aligned} \mathbb {H}^2_{\beta }(\mathbf{D})= & {} \Bigg \{f: {\mathbf{D}}\rightarrow \mathbf{C}\ | \ f\ \mathrm{is \ holomorphic \ in \ {\mathbf{D}}, \ and\ }\nonumber \\ \Vert f\Vert _{\mathbb {H}^2_{\beta }(\mathbf{D})}^2= & {} \left. \sum \limits _{k=0}^{\infty }(k+1)^{\beta }|c_k|^2<\infty \right\} , \end{aligned}$$

and the reproducing kernels are

$$\begin{aligned} k^\beta (a,z)= \sum \limits _{k=0}^{\infty }\frac{1}{(k+1)^{\beta }}(\overline{a}z)^k. \end{aligned}$$

Moreover, \(\Vert k^\beta (a,z)\Vert ^2=k^\beta (a,a)= \sum \nolimits _{k=0}^{\infty }\frac{1}{(k+1)^{\beta }}|a|^{2k}.\) As the former cases, denote by \(e_a^\beta \) the normalized reproducing kernel, that is \(e_a^\beta (z)=k^{\beta }(a,z)/\Vert k^{\beta }(a,z)\Vert .\) We note that for \(\beta >0\) the imbedding \(\mathbb {H}^2_{\beta }\subset {\mathbb {H}}^2\) is continuous.

We will show that for \(0<\beta \le 1\) the spaces \(\mathbb {H}^2_{\beta }\) enjoy the BVP, and as a consequence, have maximal selection principle; whereas for \(\beta >1\) they do not.
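The dichotomy can be observed numerically: \(\langle f,e_b^\beta \rangle =f(b)/\Vert k^\beta (b,\cdot )\Vert ,\) and the kernel norm blows up as \(|b|\rightarrow 1\) exactly when \(\beta \le 1.\) The following Python sketch (with an arbitrary test function, evaluated along a radius) compares \(\beta =0.5\) and \(\beta =2\):

import numpy as np

N = 200000
k = np.arange(N)
c = 1.0 / (k + 1.0)**2                         # an illustrative f, lying in H^2_beta for every beta < 3

def inner_with_e_b(beta, r):
    # <f, e_b^beta> = f(b) / ||k^beta(b, .)||, taken here along the radius b = r in (0,1)
    f_b = np.sum(c * r**k)
    kernel_norm = np.sqrt(np.sum(r**(2 * k) / (k + 1.0)**beta))
    return f_b / kernel_norm

for beta in (0.5, 2.0):
    print(beta, [round(inner_with_e_b(beta, r), 4) for r in (0.9, 0.99, 0.999, 0.9999)])
# for beta = 0.5 the printed values decrease toward 0 (BVP holds);
# for beta = 2 they approach the positive limit f(1)/A_2 (BVP fails), as in Lemma 6.2 below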

Lemma 6.1

Assume \(\beta \in (0,1].\) For any \(f\in \mathbb {H}^2_{\beta }\) there exists \(a\in \mathbf{D}\) such that

$$\begin{aligned} |\langle f,e_{a}\rangle |^2=\max \{ |\langle f,e_{b}\rangle |^2\ :\ b\in \mathbf{D}\}. \end{aligned}$$

Proof

The Cauchy-Schwarz inequality gives

$$\begin{aligned} |\langle f,e_b\rangle |\le \Vert f\Vert . \end{aligned}$$

Therefore \( |\langle f,e_b\rangle | \) has a finite upper bound.

For \(\varepsilon >0,\) since polynomials are dense in \(\mathbb {H}^2_{\beta }\), there exists a polynomial g such that

$$\begin{aligned} \Vert f-g\Vert \le \varepsilon . \end{aligned}$$

It follows

$$\begin{aligned} |\langle f,e_b\rangle |\le |\langle g,e_b\rangle |+\varepsilon = \frac{1}{\sqrt{\sum _{k=0}^{\infty }\frac{|b|^{2k}}{(k+1)^{\beta }}}}|g(b)|+\varepsilon . \end{aligned}$$

Since \(0<\beta \leqslant 1,\) the infinite series \(\sum _{k=0}^{\infty }\frac{|b|^{2k}}{(k+1)^{\beta }}\) is unbounded as \(|b|\rightarrow 1.\) Hence, the right hand side of the last inequality tends to \(\varepsilon \) as \(|b|\rightarrow 1.\) As a consequence, \(|\langle f,e_b\rangle |\) attains the maximum value at an interior point. \(\square \)

The following result shows that for \(\beta >1\) the space \(\mathbb {H}^2_{\beta }\) does not possess the BVP property.

Lemma 6.2

Assume \(\beta >1.\) Then for any non-trivial \(f\in \mathbb {H}^2_{\beta }\) there exist \(C_f>0\) and \(\{a_n\}_{n=1}^\infty \subset \mathbf{D}\) such that \(\lim \limits _{n\rightarrow \infty }|a_n|= 1\) and

$$\begin{aligned} \overline{\lim \limits _{n\rightarrow \infty }} |\langle f, e_{a_n}\rangle |\ge C_f. \end{aligned}$$

Proof

For \(a\in \mathbf{D},\) set

$$\begin{aligned} A_\beta (a)=\sqrt{\sum _{k=0}^{\infty }\frac{|a|^{2k}}{(k+1)^{\beta }}}, \quad A_\beta =\sqrt{\sum _{k=0}^{\infty }\frac{1}{(k+1)^{\beta }}}, \quad \quad \alpha _0 =\Vert f\Vert _{{\mathbb {H}}^2}, \end{aligned}$$

and

$$\begin{aligned} 0<\varepsilon < \frac{\alpha _0}{2(1+A_\beta )}. \end{aligned}$$
(6.1)

Since \(\beta >1,\) the above quantities are well defined.

For such an \(\varepsilon \) one can find a polynomial \(g_\varepsilon \) such that \(\Vert f-g_\varepsilon \Vert _{\mathbb {H}^2_{\beta }}<\varepsilon ,\) and then for any \(a\in \mathbf{D},\)

$$\begin{aligned} |\langle f, e_a\rangle |\ge & {} |\langle g_\varepsilon , e_a\rangle _{\mathbb {H}^2_{\beta }} |-\varepsilon \nonumber \\= & {} \left| \frac{g_\varepsilon (a)}{A_\beta (a)}\right| -\varepsilon . \end{aligned}$$
(6.2)

Since the imbedding \(\mathbb {H}^2_{\beta }\subset {\mathbb {H}}^2\) is continuous,

$$\begin{aligned} \Vert f\Vert _{{\mathbb {H}}^2}-\Vert g_\varepsilon \Vert _{{\mathbb {H}}^2}\le \Vert f-g_\varepsilon \Vert _{{\mathbb {H}}^2}\le \Vert f-g_\varepsilon \Vert _{{\mathbb {H}}_\beta ^2}\le \varepsilon . \end{aligned}$$

There then follows

$$\begin{aligned} \left( \frac{1}{2\pi }\int _0^{2\pi }|g_\varepsilon (e^{it})|^2dt\right) ^{1/2}=\Vert g_\varepsilon \Vert _{{\mathbb {H}}^2}\ge \Vert f\Vert _{{\mathbb {H}}^2}-\varepsilon , \end{aligned}$$

where \(g_\varepsilon (e^{it})\) denotes the non-tangential boundary limit function of the Hardy space function \(g_\varepsilon \in {\mathbb {H}}^2\) on \(\partial \mathbf{D}.\) Among the points on the boundary on which the non-tangential boundary limit exists, there in particular exists a point \(e^{it_0}\) such that

$$\begin{aligned} |g_\varepsilon (e^{it_0})|\ge \Vert f\Vert _{{\mathbb {H}}^2}-\varepsilon , \end{aligned}$$

and a sequence of points \( a_n\in \mathbf{D}\) that non-tangentially tends to \(e^{it_0}\) and

$$\begin{aligned} \lim _{n\rightarrow \infty }g_\varepsilon (a_n)=g_\varepsilon (e^{it_0}). \end{aligned}$$

From (6.2) we obtain, for each n

$$\begin{aligned} |\langle f, e_{a_n}\rangle | \ge \left| \frac{g_\varepsilon (a_n)}{A_\beta (a_n)}\right| -\varepsilon . \end{aligned}$$

Taking limit \(n\rightarrow \infty \) on the above sequence of inequalities, we have

$$\begin{aligned} \overline{\lim }_{n\rightarrow \infty } |\langle f, e_{a_n}\rangle |\ge \frac{\Vert f\Vert _{{\mathbb {H}}^2}-\varepsilon }{A_\beta }-\varepsilon = \frac{\alpha _0-\varepsilon }{A_\beta }-\varepsilon \ge \frac{\alpha _0}{2A_\beta }, \end{aligned}$$

provided \(\varepsilon <\frac{\alpha _0}{2(1+A_\beta )},\) which is guaranteed by (6.1). The proof of the lemma is now complete with \(C_f=\frac{\alpha _0}{2A_\beta }.\) \(\square \)

Based on Lemma 6.1 one can perform POAFD in \(\mathbb {H}^2_{\beta }\) for \(\beta \le 1\) and obtain an adaptive Fourier decomposition that converges at a fast pace to the given function. In the Dirichlet space case, for instance, corresponding to \(\beta =1,\) one can obtain a POAFD type fast decomposition as linear combinations of the parameterized reproducing kernels as well as of their first and higher order derivatives in general. For studies on the related zero-based invariant subspaces of the Dirichlet space the reader is referred to [24, 25].

For \(\beta >1,\) due to the unavailability of BVP, proved in Lemma 6.2, one cannot directly perform POAFD. In this situation we can proceed with two strategies.

The first strategy makes use of the operator \({\mathfrak {D}}=\frac{d}{dz}\circ S,\) where S is the forward shift operator \(Sf(z)=zf(z).\) The operator \(\mathfrak {D}\) reduces the approximation in \(\mathbb {H}^2_{\beta }, \beta =[\beta ]+\beta ',\) where \([\beta ]\) denotes the largest integer not exceeding \(\beta ,\) to the approximation in \({\mathbb {H}}^2_{\beta '}\) or in \({\mathbb {H}}^2_{\beta '-1},\) depending on whether \([\beta ]\) is an even or an odd integer, respectively. As a matter of fact, when \([\beta ]\) is even, \(f\in \mathbb {H}^2_{\beta }\) if and only if \({\mathfrak {D}}^{[\beta ]/2}f\in {\mathbb {H}}^2_{\beta '};\) and when \([\beta ]\) is odd, \(f\in \mathbb {H}^2_{\beta }\) if and only if \({\mathfrak {D}}^{([\beta ]+1)/2}f\in {\mathbb {H}}^2_{\beta '-1},\) the latter being a Bergman space. In each of the two cases, respectively in \({\mathbb {H}}^2_{\beta '}\) or \({\mathbb {H}}^2_{\beta '-1},\) the POAFD approximation is available. As a consequence, there exists a sequence of partial sums \(Q_n,\) being linear combinations of a sequence of generalized reproducing kernels in the respective space, that converges adaptively and at a fast pace to \({\mathfrak {D}}^{[\beta ]/2}f\) or to \({\mathfrak {D}}^{([\beta ]+1)/2}f.\) As a result, \(\mathfrak {D}^{-[\beta ]/2}Q_n\) or \(\mathfrak {D}^{-([\beta ]+1)/2}Q_n\) converges to f. The final convergence is also adaptive in the energy sense and efficient. The operators \(\mathfrak {D}\) and \(\mathfrak {D}^{-1}\) can be computationally realized either by using the pair of Fourier multipliers \(\{(k+1)\}_{k=0}^\infty \) and \(\{(k+1)^{-1}\}_{k=0}^\infty \) or, alternatively, by using the classical differentiation and integration.
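A minimal Python sketch of the multiplier realization of \(\mathfrak {D}\) and \(\mathfrak {D}^{-1}\) on truncated Taylor coefficients (illustrative; the coefficient sequence and the value of \(\beta \) are arbitrary), checking that one application of \(\mathfrak {D}\) lowers the weight index by 2:

import numpy as np

def D_op(c):
    # (D f)(z) = (d/dz)(z f(z)): multiplies the k-th Taylor coefficient by (k+1)
    k = np.arange(len(c))
    return (k + 1.0) * np.asarray(c)

def D_inv(c):
    # the inverse multiplier 1/(k+1)
    k = np.arange(len(c))
    return np.asarray(c) / (k + 1.0)

def weighted_norm(c, beta):
    k = np.arange(len(c))
    return np.sqrt(np.sum((k + 1.0)**beta * np.abs(np.asarray(c))**2))

c = 1.0 / (np.arange(100) + 1.0)**3                # illustrative Taylor coefficients
beta = 2.7                                         # [beta] = 2 (even), beta' = 0.7
print(np.isclose(weighted_norm(c, beta), weighted_norm(D_op(c), beta - 2.0)))   # True
print(np.allclose(D_inv(D_op(c)), c))                                           # True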

As the second strategy, one can perform Weak-POAFD [10]. The latter, although not as efficient as POAFD at each step, also gives rise to highly efficient approximations. The difference is, in contrast with (3.12), for a given \(0<\rho <1,\) to find \(a_n\in {\mathbf{D}},\) not coinciding with any of the previously selected \(a_1, \ldots , a_{n-1},\) such that

$$\begin{aligned} |\langle f, B^{a_n}_n \rangle | \ge \rho \sup \{ |\langle f, B^{b}_n \rangle |\ :\ b\in {\mathbf{D}}\}, \end{aligned}$$

and we set \((B_1,\ldots ,B_{n-1},B_n)\) to be the Gram–Schmidt orthonormalization of \((B_1,\ldots , B_{n-1},e_{a_n}).\) Weak-POAFD allows more freedom in the selection of the parameters.
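The weak selection step can be sketched as follows (a hedged Python illustration in \(\mathbb {H}^2_{\beta }\) with \(\beta =2;\) the grid, the relaxation factor \(\rho \), the truncation order and the target are arbitrary choices, and near-coincidence with previously selected parameters is simply skipped):

import numpy as np

N = 2000
k = np.arange(N)
beta = 2.0                                     # a case where BVP fails and weak-POAFD applies

def inner(c, d):
    # H^2_beta inner product from Taylor coefficients
    return np.sum((k + 1.0)**beta * c * np.conj(d))

def kernel(b):
    # Taylor coefficients of k^beta(b, .) = sum_k (conj(b) z)^k / (k+1)^beta
    return np.conj(b)**k / (k + 1.0)**beta

def weak_selection(f, B, grid, rho=0.8):
    # pick a_n with |<f, B_n^{a_n}>| >= rho * sup_b |<f, B_n^b>| (grid surrogate of the sup)
    candidates = []
    for b in grid:
        v = kernel(b)
        for Bk in B:                           # pre-orthogonalize against the previous B_k
            v = v - inner(v, Bk) * Bk
        nrm = np.sqrt(inner(v, v).real)
        if nrm > 1e-12:                        # skip (near-)coincidence with previous parameters
            candidates.append((abs(inner(f, v / nrm)), b, v / nrm))
    sup_val = max(val for val, _, _ in candidates)
    for val, b, Bb in candidates:              # accept the first candidate within the factor rho
        if val >= rho * sup_val:
            return b, Bb

f = (1.0 / (k + 1.0)**2).astype(complex)       # an illustrative target in H^2_2
grid = [r * np.exp(1j * t) for r in np.linspace(0.1, 0.9, 9)
        for t in np.linspace(0, 2 * np.pi, 24, endpoint=False)]
B, rem = [], f.copy()
for step in range(3):
    a_n, B_n = weak_selection(rem, B, grid)
    B.append(B_n)
    rem = rem - inner(rem, B_n) * B_n
    print(step + 1, np.round(a_n, 3), np.sqrt(inner(rem, rem).real))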