
1 Introduction

Series expansions and integral representations of functions and operators play a fundamental role in the analysis of direct problems of applied mathematics—witness the role of power series, Fourier series, Karhunen–Loève expansion, eigenfunction expansions of symmetric linear operators, and sampling expansions in signal processing; and the role of Fourier transform, spectral integral representations, and various integral representations in boundary value problems, potential theory, complex analysis, and other areas.

Expansion theorems also play a fundamental role in inverse problems. Two important problems discussed below are: (1) the recovery of a function from its inner products with a given set of functions (i.e., the moment problem), and (2) the recovery of a function from its values on a subset of its domain (i.e., reconstruction of a function from its samples via a sampling expansion).

One of the important problems in analysis is to expand a given function f in a separable Banach space by a series of the form

$$f(t) = \sum\limits_{n=1}^{\infty }{c}_{ n}{f}_{n}(t),$$
(4.1)

where \(\{{f}_{n}\}_{n=1}^{\infty }\) is a suitable sequence of functions. This is not always possible. We consider two important cases of such an expansion:

  1. 1.

    Let H be a separable Hilbert space and \(\{{f}_{n}\}_{n=1}^{\infty }\) be an orthonormal sequence in H. If \(\{{c}_{n}\}_{n=1}^{\infty }\) is a sequence for which the right-hand side of the expansion (4.1) converges to f in H, then c n  = ⟨f, f n ⟩ for all n ≥ 1, and hence c n , n ≥ 1, are the (generalized) Fourier coefficients. In this case, the series

    $$f(t) = \sum\limits_{n=1}^{\infty }\langle f,{f}_{ n}\rangle {f}_{n}(t)$$
    (4.2)

    is called the (generalized) Fourier expansion. This series converges to f if and only if the orthonormal sequence {f n } is complete in the sense that the only function orthogonal to all the f n ’s is the zero function. Equivalently, the Parseval equality

    $$ \sum\limits_{n=1}^{\infty }\vert \langle f,{f}_{ n}\rangle {\vert }^{2} =\| {f\|}^{2}$$

    holds for every f ∈ H.

    This expansion theorem is a direct problem: given f, find its expansion. The associated inverse problem is the moment problem: determine f from its moments ⟨f, f n ⟩, n ∈ J (an index set). Given any sequence of real numbers s n , n ∈ J, the existence question for the moment problem is whether there exists a function f such that s n  = ⟨f, f n ⟩, n ∈ J, while the uniqueness question is whether such a function f is uniquely determined by its moment sequence s n , n ∈ J.

  2. 2.

    The second type of expansion, which is the central theme of this chapter, is what is called a sampling expansion:

    $$f(t) = \sum\limits_{n=1}^{\infty }f({t}_{ n}){S}_{n}(t),$$
    (4.3)

    where \(\{{S}_{n}\}_{n=1}^{\infty }\) is called a sampling sequence and \(\{{t}_{n}\}_{n=1}^{\infty }\) is the sampling set. The inverse problem for sampling is to determine f from the given samples f(t n ), n ≥ 1.
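The uniqueness question for the moment problem has a transparent answer in the finite-dimensional orthonormal case. The following is a minimal numerical sketch (assuming NumPy; the basis and signal are randomly generated for the example):

```python
import numpy as np

# Finite-dimensional sketch of the moment problem: when {f_n} is an
# orthonormal basis, the moments s_n = <f, f_n> determine f uniquely,
# and f is recovered by the Fourier expansion (4.2).
rng = np.random.default_rng(0)

# An orthonormal basis of R^8: the columns of Q from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))

f = rng.standard_normal(8)          # the unknown "function"
moments = Q.T @ f                   # moments s_n = <f, f_n>
f_recovered = Q @ moments           # f = sum_n s_n f_n

print(np.allclose(f, f_recovered))  # True: the moments determine f
```

If the orthonormal sequence is not complete (fewer columns than rows above), the moments determine only the projection of f onto the span of the sequence, and uniqueness fails.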

There have been many advances in sampling theory and its applications to signal and image processing. In the past three decades, many authors developed sampling theorems based on (1) the theory of regular and singular boundary value problems and (2) transforms other than the Fourier transform, such as the Sturm–Liouville, Jacobi, and Hankel transforms; see [43]. Another main thrust has been in nonuniform sampling for non-bandlimited signals.

In the past 20 years, there have been major advances in sampling theory and its foundational aspects, where methods of functional analysis and harmonic analysis have played pivotal roles. In particular, new directions in sampling theory have been pursued using various function spaces that admit sampling expansions, such as reproducing kernel Hilbert (Banach) spaces, Sobolev spaces, Wiener amalgam spaces, shift-invariant spaces, translation-invariant spaces, and spaces modeling signals with finite rate of innovation. Another direction of research on sampling in the past decade involves average sampling, convolution sampling, nonlinear sampling, and other fundamental issues in sampling theory. The reader may refer to [1, 6, 7, 9–11, 15–24, 30, 33, 37, 39, 43] for various perspectives on these advances.

The purpose of this chapter is to consider a variety of function spaces mentioned above in which every function admits the sampling expansion (4.3). Representative sampling expansions are presented for signals in each of these spaces. This chapter also includes recent results on nonlinear sampling for signals with finite rate of innovation, convolution sampling on Banach spaces, and certain foundational issues in sampling expansions.

2 Fourier Series/Fourier Integral Approach to the Whittaker, Shannon, and Kotel’nikov Sampling Theorem

Let f be a bandlimited signal with finite energy; i.e.,

$$f(t) ={ \int}_{-\Omega }^{\Omega }F(\omega ){\mathrm{e}}^{i\omega t}\mathrm{d}\omega,\ t \in (-\infty,\infty ),$$
(4.4)

for some square-integrable function F on [ − Ω, Ω], Ω > 0. We extend F(ω) periodically to the real line and expand the extension in complex Fourier series:

$$F(\omega ) = \sum\limits_{n=-\infty }^{\infty }{c}_{ n}\exp (in\pi \omega /\Omega ),\ \vert \omega \vert < \Omega,$$
(4.5)

where

$${c}_{n} = \frac{1} {2\Omega }{\int}_{-\Omega }^{\Omega }F(\omega )\exp (-in\pi \omega /\Omega )\mathrm{d}\omega.$$
(4.6)

Comparing (4.4) and (4.6) leads to

$${c}_{n} = \frac{1} {2\Omega }f(-n\pi /\Omega ).$$
(4.7)

Substituting (4.7) in (4.5) gives

$$\begin{array}{rcl} F(\omega )& =& \frac{1} {2\Omega } \sum\limits_{n=-\infty }^{\infty }f(-n\pi /\Omega )\exp (in\pi \omega /\Omega ) \\ & =& \frac{1} {2\Omega } \sum\limits_{n=-\infty }^{\infty }f(n\pi /\Omega )\exp (-in\pi \omega /\Omega ),\ \vert \omega \vert < \Omega \end{array}$$

Substituting this in (4.4) and interchanging the order of integration and summation leads to

$$f(t) = \frac{1} {2\Omega } \sum\limits_{n=-\infty }^{\infty }f(n\pi /\Omega ){\int}_{-\Omega }^{\Omega }\exp (-in\pi \omega /\Omega )\exp (i\omega t)\mathrm{d}\omega,$$

which yields the celebrated classical expansion of a bandlimited signal f:

$$f(t) = \sum\limits_{n=-\infty }^{\infty }f(n\pi /\Omega )\frac{\sin (\Omega t - n\pi )} {\Omega t - n\pi }.$$
(4.8)

This can be simplified to

$$f(t) =\sin (\Omega t) \sum\limits_{n=-\infty }^{\infty }f(n\pi /\Omega ) \frac{{(-1)}^{n}} {\Omega t - n\pi }.$$

This classical proof is rigorous. The interchange of integration and summation can be easily justified. But the proof is not very revealing: We perform this interchange and a theorem pops up. A theorem is born, but we did not hear the heartbeat of its proof.
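A quick numerical sanity check of the expansion (4.8) is easy to run (a sketch assuming NumPy; the test signal and truncation range are arbitrary choices). With Ω = π the sampling points are the integers, and np.sinc(x) computes sin(πx)/(πx):

```python
import numpy as np

# Truncated Whittaker-Shannon-Kotel'nikov expansion (4.8) with Omega = pi:
# f(t) ~= sum_{|n| <= N} f(n) sinc(t - n) for a pi-bandlimited f.
def f(t):
    # np.sinc(t/2)**2 is bandlimited to [-pi, pi] and decays like 1/t^2.
    return np.sinc(t / 2) ** 2

n = np.arange(-200, 201)                 # truncated sampling set
t = np.linspace(-5.0, 5.0, 101)          # evaluation points

approx = np.sinc(t[:, None] - n[None, :]) @ f(n)
print(np.max(np.abs(approx - f(t))))     # small truncation error
```

The printed error reflects only the truncation of the series; it shrinks as the sampling range widens, in line with the 1/t² decay of the chosen signal.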

3 Properties of the Sinc Function and the Paley–Wiener Space

Let B π consist of all signals that are bandlimited to [ − π, π] (i.e., Ω = π in (4.4)) and have finite energy. The space B π is the same as the Paley–Wiener space of restrictions to the real axis of entire functions of exponential type π. The Paley–Wiener space B π has many interesting properties that are not exploited or used in the classical proof of the Whittaker–Shannon–Kotel’nikov sampling theorem. Some of these properties are stated in the following theorem.

Theorem 1.

  • B π is a reproducing kernel space with kernel

    $$\mathrm{sinc}(s - t) := \frac{\sin \pi (s - t)} {\pi (s - t)}.$$
  • The sequence {S n } n∈Z , where \({S}_{n}(t) :=\mathrm{ sinc}(t - n)\) , is an orthonormal basis for B π .

  • The sequence {S n } n∈Z has the discrete orthogonality property:

    $${S}_{n}(m) = {\delta }_{mn} := \left \{\begin{array}{ll} 1&\ \mathrm{if}\ m = n,\\ 0 &\ \mathrm{if }\ m\neq n.\end{array} \right.$$
  • f(⋅− c) ∈ B π and \(\|f{(\cdot - c)\|}_{2} =\| {f\|}_{2}\) for all f ∈ B π and c ∈ R. Hence B π is a unitarily translation-invariant subspace of L 2(R).

  • B π is a shift-invariant subspace of L 2(R) generated by the sinc function:

    $${B}_{\pi } = \left \{ \sum\limits_{n\in \mathbf{Z}}c(n)\ \mathrm{sinc}(t - n) :\ \sum\limits_{n\in \mathbf{Z}}\vert c(n){\vert }^{2} < \infty \right \}.$$

Proof.

  1. (i)

    Take f ∈ B π, and let F be the square-integrable function in (4.4). Then

    $$f(t) ={ \int}_{-\pi }^{\pi }{\mathrm{e}}^{it\omega }F(\omega )\mathrm{d}\omega,\ t \in (-\infty,\infty ).$$
    (4.9)

    This implies that

    $$\vert f(t)\vert \leq {\int}_{-\pi }^{\pi }\vert F(\omega )\vert d\omega \leq {(2\pi )}^{1/2}\|{F\|}_{ 2} =\| {f\|}_{2}$$

    and

    $$\begin{array}{rcl} f(t)& =& \frac{1} {2\pi }{\int}_{-\pi }^{\pi }{\mathrm{e}}^{it\omega }\left ({\int}_{-\infty }^{\infty }f(s){\mathrm{e}}^{-is\omega }\mathrm{d}s\right )\mathrm{d}\omega \\ & =& {\int}_{-\infty }^{\infty }f(s)\left ( \frac{1} {2\pi }{\int}_{-\pi }^{\pi }{\mathrm{e}}^{i(t-s)\omega }\mathrm{d}\omega \right )\mathrm{d}s \\ & =& {\int}_{-\infty }^{\infty }f(s)\frac{\sin \pi (s - t)} {\pi (s - t)}\mathrm{d}s,\ t \in (-\infty,\infty )\end{array}$$

    Hence B π is a reproducing kernel Hilbert space with kernel \(\frac{\sin \pi (s-t)} {\pi (s-t)}\).

  2. (ii)

    By the reproducing property and symmetry of the sinc kernel \(k(s,t) := \frac{\sin \pi (s-t)} {\pi (s-t)}\), we have that ⟨S n , S m ⟩ = k(n, m), which takes the value one if m = n and zero if m ≠ n. Hence S n , n ∈ Z, is an orthonormal set. The completeness of the orthonormal set {S n } n ∈ Z follows from (4.8).

  3. (iii)

    The discrete orthogonality property is obvious.

  4. (iv)

    Take f ∈ B π, and let F be the square-integrable function in (4.4). Then it follows from (4.9) that for any c ∈ ( − ∞, ∞),

    $$f(t - c) ={ \int}_{-\pi }^{\pi }{\mathrm{e}}^{i(t-c)\omega }F(\omega )\mathrm{d}\omega ={ \int}_{-\pi }^{\pi }{\mathrm{e}}^{it\omega }{F}_{c}(\omega )\mathrm{d}\omega,$$

    where \({F}_{c}(\omega ) ={ \mathrm{e}}^{-ic\omega }F(\omega )\) is square-integrable. Hence f(⋅− c) ∈ B π.

  5. (v)

    This follows from the conclusion that {S n } n ∈ Z is an orthonormal basis for the Paley–Wiener space B π.
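The reproducing property in part (i) can also be checked numerically (a sketch assuming NumPy; the test signal, grid, and evaluation points are illustrative choices). Since the integrand is itself bandlimited, a Riemann sum on a fine grid is essentially exact up to truncation of the tails:

```python
import numpy as np

def f(t):
    return np.sinc(t / 2) ** 2            # a pi-bandlimited test signal

h = 0.05                                  # grid spacing, far below Nyquist
s = np.arange(-400, 400, h)               # long quadrature grid

# Reproducing property: f(t) = int f(s) sinc(s - t) ds.
for t in (0.0, 0.7, 2.3):
    val = np.sum(f(s) * np.sinc(s - t)) * h   # Riemann sum of the integral
    print(t, val, f(t))                       # val matches f(t) closely

# Discrete orthogonality S_n(m) = delta_{mn}, up to floating-point rounding:
print(np.sinc(0), abs(np.sinc(3 - 5)) < 1e-12)
```

For bandlimited integrands, the rectangle rule at sub-Nyquist spacing reproduces the integral exactly; the only error here comes from cutting the grid off at ±400.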

4 The Engineering Approach to Sampling and Its Mathematical Deficiencies

We now turn to the clever engineering approach to the sampling theorem; see, e.g., [13, 30, 32]. We paraphrase some of the description in [13]. Let us consider what happens when we sample f(t) at uniformly spaced times. If the sampling frequency is f s , then we can model this with a multiplication of f(t) by a train of Dirac impulses spaced \({T}_{s} := 1/{f}_{s}\) seconds apart:

$${f}^{{_\ast}}(t) := f(t){T}_{ s} \sum\limits_{n=-\infty }^{\infty }\delta (t - n{T}_{ s}) = {T}_{s} \sum\limits_{n=-\infty }^{\infty }f(n{T}_{ s})\delta (t - n{T}_{s}).$$

Mathematicians consider the sequence of sampled values \(\{f{(n{T}_{s})\}}_{n=-\infty }^{\infty }\) as a vector in ℓ 2 (the space of all square-summable sequences). Electrical engineers like to continue to think of this sequence as a time signal. So to stay in the analog world, they use their beloved “Dirac impulse” as above. Informally, as they assert, “multiplication by an impulse train” in the time domain corresponds to convolution with an impulse train in the frequency domain. If the Fourier transform of the sampled signal f ∗ (t) is F ∗ (ω), then

$${F}^{{_\ast}}(\omega ) = F(\omega ) {_\ast} \sum\limits_{n=-\infty }^{\infty }\delta (\omega - n{\omega }_{ s}) = \sum\limits_{n=-\infty }^{\infty }F(\omega - n{\omega }_{ s}).$$

Hence, again informally, the Fourier transform of the samples (considered as a time signal in the sense of the above representation) is an infinitely repeated replication of F(ω) at intervals of ω s : = 2πf s . The portion of the transform between \(-{\omega }_{s}/2\) and ω s  ∕ 2 is called the base band, and all the other copies are called replication images. If f(t) is bandlimited so that F(ω) is zero for | ω |  > 2πf c , and if f s  ≥ 2f c (Shannon’s rate), then there is no overlap between successive replications. We have lost no information in the sampling process; if anything, we have picked up a lot of “superfluous information” at frequencies outside the range of interest.
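The overlap of replication images is exactly the aliasing phenomenon. The following sketch (assuming NumPy; the rates are illustrative) shows two distinct frequencies that differ by ω s producing identical samples:

```python
import numpy as np

# Aliasing sketch: sampling at rate f_s makes frequencies that differ by a
# multiple of omega_s = 2*pi*f_s indistinguishable -- the replication images
# F(omega - n*omega_s) overlap the base band once f_s < 2*f_c.
fs = 8.0                             # sampling rate (samples per second)
Ts = 1.0 / fs
n = np.arange(32)                    # sample indices

w1 = 2 * np.pi * 3.0                 # 3 Hz tone, inside the base band
w2 = w1 - 2 * np.pi * fs             # its -5 Hz alias (shifted by omega_s)

x1 = np.cos(w1 * n * Ts)
x2 = np.cos(w2 * n * Ts)
print(np.allclose(x1, x2))           # True: the samples are identical
```

A 3 Hz tone and its 5 Hz alias cannot be told apart from samples taken at 8 Hz, which is why the no-overlap condition f s ≥ 2f c is essential.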

To recover the original signal, we must remove the replication images. First F(ω) can be obtained from F  ∗ (ω) by multiplying it by the characteristic function

$${\chi }_{{\omega }_{s}/2}(\omega ) = \left \{\begin{array}{ll} 1&\ \mathrm{if}\ \vert \omega \vert \leq {\omega }_{s}/2, \\ 0&\ \mathrm{if}\ \vert \omega \vert > {\omega }_{s}/2. \end{array} \right.$$

This is done by an analog filter known as an interpolation filter or a low-pass filter. We are now back to F(ω), and the time function can be recovered via the inverse Fourier transform:

$$f(t) ={ \int}_{-\infty }^{\infty }F(\omega ){\mathrm{e}}^{i\omega t}\mathrm{d}\omega.$$

In essence engineers view the Shannon sampling theorem in terms of an impulse train and a low-pass filter in the following way [30, 32]:

$$f(t)\mathop{\longmapsto }\limits^{\mathrm{sampler}}{f}^{{_\ast}}(t)\mathop{\longmapsto }\limits^{\mathrm{low\ pass\ filter}}f(t).$$

First the signal f ∈ B π is sampled at the integers to convert f(t) into the impulse train

$${f}^{{_\ast}}(t) = \sum\limits_{n=-\infty }^{\infty }f(n)\delta (t - n),$$

where δ(t − n) is the Dirac delta function (impulse) at t = n. This is still expressed as an analog signal. This then is transmitted through an ideal low-pass filter which passes all frequencies of absolute value less than π and blocks all others. This converts f  ∗ (t) back into f(t).

We denote the sampling map by S : f ↦ f ∗ and the low-pass filter map by P : f ∗ ↦ f:

$${B}_{\pi }\mathop{\longmapsto }\limits^{\mathrm{S}} \boxed{ \text{unknown space}} \mathop{\longmapsto }\limits^{\mathrm{P}}{B}_{\pi }.$$

But the above procedure has some mathematical difficulties which are not resolved in the engineering approach:

  • The sampler S takes f into f  ∗ , which is out of the space of bandlimited functions; indeed, f  ∗  is not a signal with finite energy.

  • In what sense does the impulse train series converge? One may prove that the series converges in the sense of tempered distributions to a generalized function.

  • The map P recovers f at least formally since

    $$P(\delta (t - n)) = \frac{1} {2\pi }{\int}_{-\pi }^{\pi }{\mathrm{e}}^{-i\omega t}{\mathrm{e}}^{i\omega n}\mathrm{d}\omega = \frac{\sin \pi (t - n)} {\pi (t - n)}.$$

    If P is continuous (in some topology), then

    $$(P{f}^{{_\ast}})(t) = \sum\limits_{n=-\infty }^{\infty }f(n)P(\delta (t - n)) = \sum\limits_{n=-\infty }^{\infty }f(n)\frac{\sin \pi (t - n)} {\pi (t - n)}.$$

    However, since \(\mathcal{S}^{\prime}\) (the space of all tempered distributions, see the next section) is not a Hilbert space, we do not know if P is a continuous operator.

  • Still another difficulty! P is not even well defined on \(\mathcal{S}^{\prime}\). Indeed,

    $$Pg = {\mathcal{F}}^{-1}({\chi }_{ [-\pi,\pi ]}\hat{g}),\ g \in {\mathcal{S}}^{{\prime}},$$

    where \({\mathcal{F}}^{-1}\) is the inverse Fourier transform of a tempered distribution. So P corresponds under the Fourier transform to the multiplication of the Fourier transform \(\hat{g}\) in \({\mathcal{S}}^{{\prime}}\) by the characteristic function of [ − π, π]. Unfortunately the characteristic function is not a multiplier in \({\mathcal{S}}^{{\prime}}\). Hence we need to restrict ourselves to a subspace of \({\mathcal{S}}^{{\prime}}\) in which χ[ − π, π] is a multiplier.

These issues have been resolved by Nashed and Walter [28]. They obtained a rigorous proof of the engineering approach, by considering the sampling map S as an operator from B π to H  − 1 and the filtering map P as an operator (actually an orthogonal projection) from H  − 1 to B π:

$${B}_{\pi }\mathop{\longmapsto }\limits^{\mathrm{S}}{H}^{-1}\mathop{\longmapsto }\limits^{\mathrm{P}}{B}_{ \pi },$$

where H  − 1 is a Sobolev space, see the next section for the definition of Sobolev spaces. More importantly, by emulating and extending this proof, they obtained a general unifying approach for sampling theorems in reproducing kernel Hilbert spaces (RKHS) that include many earlier sampling theorems.

5 Function Spaces: Tempered Distributions, Sobolev Spaces, and Reproducing Kernel Hilbert Spaces

5.1 The Space of Tempered Distributions

Let \(\mathcal{S}\) be the space of all rapidly decreasing C ∞ functions on the real line R, i.e., functions g that satisfy

$$\vert {g}^{(k)}(t)\vert \leq {C}_{ p,k}{(1 + \vert t\vert )}^{-p},\ t \in \mathbf{R},$$

for all p, k = 0, 1, 2, …. Convergence in \(\mathcal{S}\) may be defined by endowing \(\mathcal{S}\) with the seminorms

$${\mu }_{p,k}(g) :{=\sup }_{t\in \mathbf{R}}{(1 + \vert t\vert )}^{p}\vert {g}^{(k)}(t)\vert.$$

Then g n  → g in \(\mathcal{S}\) whenever

$${(1 + \vert t\vert )}^{p}\left({g}_{ n}^{(k)}(t) - {g}^{(k)}(t)\right) \rightarrow 0$$

uniformly in t ∈ R for each p and k ≥ 0 as n → ∞. The set \(\mathcal{S}\) is dense in L 2 : = L 2(R) (the space of all square-integrable functions on the real line). We observe that compactly supported C ∞ functions are contained in \(\mathcal{S}\), and that the space \(\mathcal{S}\) is complete with respect to convergence in the seminorms μ p, k , p, k ≥ 0.

A tempered distribution is an element in the dual space \({\mathcal{S}}^{{\prime}}\) of \(\mathcal{S}\), i.e., \({\mathcal{S}}^{{\prime}}\) consists of all continuous linear functionals on \(\mathcal{S}\). The definition of Fourier transform

$$\mathcal{F}f(\omega ) :={ \int}_{-\infty }^{\infty }f(t){\mathrm{e}}^{-it\omega }\mathrm{d}t,\quad f \in \mathcal{S}$$

may be extended from \(\mathcal{S}\) to \({\mathcal{S}}^{{\prime}}\). The following examples of Fourier transforms on \({\mathcal{S}}^{{\prime}}\) are needed in the derivation of a rigorous setting for the proof of the engineering approach:

$$\mathcal{F}\left(\delta (t - \alpha )\right) ={ \mathrm{e}}^{-i\omega \alpha },$$
$$\mathcal{F}\left( \sum\limits_{n\in \mathbf{Z}}\delta (t - 2\pi n)\right) = \sum\limits_{n\in \mathbf{Z}}\delta (\omega - n),$$
$$\mathcal{F}\left( \sum\limits_{n\in \mathbf{Z}}{a}_{n}{\mathrm{e}}^{int}\right) = 2\pi\sum\limits_{n\in \mathbf{Z}}{a}_{n}\delta (\omega - n),$$

and

$$\mathcal{F}\left( \sum\limits_{n\in \mathbf{Z}}{a}_{n}{\mathrm{e}}^{int}{\chi }_{ [-\pi,\pi ]}(t)\right) = \sum\limits_{n\in \mathbf{Z}}{a}_{n}\frac{\sin \pi (\omega - n)} {\pi (\omega - n)},$$

where \(\{{a}_{n}\}_{n\in \mathbf{Z}} \in {\ell}^{2}\).
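The first of these identities follows directly from the duality definition \(\langle \mathcal{F}u,g\rangle :=\langle u,\mathcal{F}g\rangle \), g ∈ \(\mathcal{S}\); the others follow by similar computations (the second via the Poisson summation formula):

```latex
\langle \mathcal{F}\,\delta(\cdot-\alpha),\, g\rangle
  := \langle \delta(\cdot-\alpha),\, \mathcal{F}g\rangle
  = (\mathcal{F}g)(\alpha)
  = \int_{-\infty}^{\infty} g(t)\,\mathrm{e}^{-i\alpha t}\,\mathrm{d}t
  = \langle \mathrm{e}^{-i\alpha\,\cdot},\, g\rangle,
  \qquad g \in \mathcal{S},
```

so that \(\mathcal{F}\left(\delta (t - \alpha )\right) ={ \mathrm{e}}^{-i\omega \alpha }\) in the sense of tempered distributions.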

5.2 Sobolev Spaces

An important Hilbert space structure on certain subsets of \(\mathcal{S}^{\prime}\) is provided by a class of Sobolev spaces. For r ∈ R, the Sobolev space H r consists of all tempered distributions \(f \in {\mathcal{S}}^{{\prime}}\) such that

$${\int}_{-\infty }^{\infty }\vert \hat{f}(\omega ){\vert }^{2}{({\omega }^{2} + 1)}^{r}\mathrm{d}\omega < \infty.$$

The inner product of f and g in H r is defined by

$$\langle f,{g\rangle }_{r} :={ \int}_{-\infty }^{\infty }\hat{f}(\omega )\overline{\hat{g}(\omega )}{({\omega }^{2} + 1)}^{r}\mathrm{d}\omega,$$

where \(\hat{f} := \mathcal{F}f\) is the Fourier transform of f. The space H r is complete with respect to this inner product. For r = 0, H 0 is just L 2(R) by the Parseval identity. For r = 1, 2, …, H r is the usual Sobolev space of functions that are (r − 1)-times differentiable and whose rth derivative is in L 2(R). For \(r = -1,-2,\ldots \), H r contains tempered distributions with point support. Thus the Dirac delta δ ∈ H  − 1, and δ′, the distributional derivative of δ, belongs to H  − 2.
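Since \(\hat{\delta }\) is the constant function 1, membership of δ in H r reduces to finiteness of ∫( ω 2 + 1) r dω, which holds precisely when r < − 1 ∕ 2. A quick numerical check for r = − 1 (a sketch assuming NumPy; the integration window and step are arbitrary):

```python
import numpy as np

# delta-hat is the constant 1, so the H^{-1} norm squared of delta is
# int (1 + omega^2)^{-1} d omega = pi < infinity, while for r = 0 the
# corresponding integral over [-A, A] grows like 2A and diverges.
h = 0.01
w = np.arange(-1000, 1000, h) + h / 2      # midpoint rule on [-1000, 1000]
integral = np.sum(1.0 / (1.0 + w**2)) * h

print(integral, np.pi)                      # close; the tails contribute ~2/1000
```

The analytic value over [ − A, A] is 2 arctan A, which converges to π as A → ∞, matching the printed approximation.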

5.3 Reproducing Kernel Hilbert Spaces

A Hilbert space H of complex-valued functions on a set Ω is called a RKHS if all the evaluation functionals H ∋ f ↦ f(t) ∈ C are continuous (bounded) for each fixed t ∈ Ω; i.e., for each t ∈ Ω there exists a positive constant C t such that \(\vert f(t)\vert \leq {C}_{t}\|f\|\) for all f ∈ H.

By the Riesz representation theorem, for each t ∈ Ω there exists a unique element k t  ∈ H such that f(t) = ⟨f, k t ⟩ for all f ∈ H. The reproducing kernel k( ⋅,  ⋅) : Ω ×Ω → C of a RKHS H is defined by k(s, t) = ⟨k s , k t ⟩, s, t ∈ Ω.

We summarize some basic properties of RKHS that are particularly relevant to signal processing, wavelet analysis, and approximation theory:

  • \(k(s,t) = \overline{k(t,s)}\) for all t, s ∈ Ω.

  • k(s, s) ≥ 0 for all s ∈ Ω. Furthermore, if k(t 0, t 0) = 0 for some t 0 ∈ Ω, then f(t 0) = 0 for all f ∈ H.

  • \(\vert k(s,t)\vert \leq \sqrt{k(s, s)}\sqrt{k(t, t)}\) for all s, t ∈ Ω.

  • The reproducing kernel k(s, t) on Ω ×Ω is a nonnegative definite symmetric kernel. Conversely, by the Aronszajn–Moore theorem, every nonnegative definite symmetric function k( ⋅,  ⋅) on Ω ×Ω determines a unique Hilbert space H k for which k( ⋅,  ⋅) is a reproducing kernel [5]. Here a complex-valued function F on Ω ×Ω is said to be nonnegative definite if for any n points t 1, …, t n  ∈ Ω, the matrix A : = (F(t i , t j ))1 ≤ i, j ≤ n is nonnegative definite, i.e., \({u}^{{_\ast}}Au = \sum\limits_{i,j=1}^{n}\overline{{u}_{i}}F({t}_{i},{t}_{j}){u}_{j} \geq 0\) for all u = (u 1, …, u n ) ∈ C n.

  • A closed subspace \(\tilde{H}\) of a RKHS H is also a RKHS. Moreover, the orthogonal projector P of H onto \(\tilde{H}\) and the reproducing kernel \(\tilde{k}(s,t)\) of the RKHS \(\tilde{H}\) are related by \(Pf(s) =\langle f,\tilde{{k}}_{s}\rangle,s \in \Omega \) for all f ∈ H where \(\tilde{{k}}_{s} = P{k}_{s}\).

  • If a RKHS space H with kernel k( ⋅,  ⋅) has direct orthogonal decomposition H = H 1 ⊕ H 2 for some complementary orthogonal closed subspaces H 1 and H 2, then \(k = {k}_{1} + {k}_{2}\), where k 1, k 2 are reproducing kernels of the reproducing kernel Hilbert spaces H 1 and H 2, respectively.

  • In a RKHS, the element representing a given bounded linear functional ϕ can be expressed by means of the reproducing kernel: ϕ(f) = ⟨f, h⟩, where \(h(t) = \overline{\phi ({k}_{t})}\). Similarly, for a bounded linear operator L from H to H, we have that \(Lf(t) =\langle Lf,{k}_{t}\rangle =\langle f,{L}^{{_\ast}}{k}_{t}\rangle\).

  • Every finite-dimensional function space is a RKHS H with reproducing kernel \(k(s,t) = \sum\limits_{i=1}^{n}{u}_{i}(s)\overline{{u}_{i}(t)}\), where \(\{{u}_{i}\}_{i=1}^{n}\) is an orthonormal basis for H. (Notice that the sum in the above definition of the kernel k is invariant under the choice of orthonormal basis.)

  • The space W 1 of all absolutely continuous functions f on [0, 1] such that f′ (which exists almost everywhere) is in L 2[0, 1] and f(0) = 0 is a RKHS with kernel k(s, t) = min(s, t) under the inner product \(\langle f,g\rangle ={ \int}_{0}^{1}f^{\prime}(t)\overline{g^{\prime}(t)}\mathrm{d}t\).

  • The Sobolev space H s, s > 1 ∕ 2, is a reproducing kernel Hilbert space.

  • Let H be a separable RKHS; then its reproducing kernel k( ⋅,  ⋅) has the expansion

    $$k(s,t) = \sum\limits_{n=1}^{\infty }{\varphi }_{ n}(t)\overline{{\varphi }_{n}(s)},$$

    where \(\{{\varphi }_{n}\}_{n=1}^{\infty }\) is an orthonormal basis for H. We remark that for a general separable Hilbert space H, \( \sum\limits_{n=1}^{\infty }{\varphi }_{n}(t)\overline{{\varphi }_{n}(s)}\) is not a reproducing kernel, and the φ n ’s do not generally correspond to sampling expansions. If they do, i.e., if φ n (t) = k(t n , t) for some sequence {t n }, then \(f(t) = \sum\limits_{n=1}^{\infty }f({t}_{n}){\varphi }_{n}(t)\); this constitutes a sampling theorem.

  • If the reproducing kernel k(s, t) of a RKHS H is continuous on Ω ×Ω, then H is a space of continuous functions (uniformly continuous on a bounded Ω). This follows from

    $$\vert f(t) - f(s)\vert = \vert \langle f,{k}_{t} - {k}_{s}\rangle \vert \leq \| f\|\|{k}_{t} - {k}_{s}\|$$

    and \(\|{k}_{t} - {k{}_{s}\|}^{2} = k(t,t) - 2k(t,s) + k(s,s)\) for all s, t ∈ Ω.

  • Strong convergence in a RKHS H implies pointwise convergence and uniform convergence on compact sets, because

    $$\vert f(t) - {f}_{n}(t)\vert = \vert \langle f - {f}_{n},{k}_{t}\rangle \vert \leq \| f - {f}_{n}\|\sqrt{k(t, t)}.$$
  • L 2[a, b], the space of all square-integrable functions on the interval [a, b], is not a RKHS. Indeed, point evaluation is not well defined: each function f ∈ L 2[a, b] is actually an equivalence class of functions equal to each other almost everywhere, so the “value” at a point has no meaning, since any single point has measure zero.
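Several of the properties above can be verified at once in the finite-dimensional case (a minimal sketch assuming NumPy; the finite set Ω = {0, …, 9} and the random three-dimensional subspace are illustrative):

```python
import numpy as np

# Finite-dimensional RKHS sketch: on Omega = {0,...,9}, take H to be the
# span of 3 orthonormal vectors u_1, u_2, u_3 (the columns of U).  The
# kernel k(s,t) = sum_i u_i(s) * u_i(t) then reproduces every f in H.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((10, 3)))   # orthonormal columns

K = U @ U.T                       # kernel matrix k(s, t)
f = U @ rng.standard_normal(3)    # an arbitrary element of H

# <f, k_t> = sum_s f(s) k(s, t) = (K f)(t), and on H this equals f(t).
print(np.allclose(K @ f, f))      # True: the reproducing property

# The kernel is invariant under the choice of orthonormal basis:
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # rotate the basis
print(np.allclose((U @ Q) @ (U @ Q).T, K))          # True
```

The second check is the basis-invariance remark made above for the kernel of a finite-dimensional space.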

6 Rigorous Justification of the Engineering Approach to Sampling

This section involves a search for function spaces in which the mathematical difficulties described in Sect. 4.3 are resolved. Clearly we want to work with a subspace H of the space \({\mathcal{S}}^{{\prime}}\) of tempered distributions. We require that δ ∈ H and that the impulse train converges in H under a mild condition on the samples \(\{f{({t}_{n})\}}_{n=-\infty }^{\infty }\). As remarked in Sect. 4.4, the characteristic function is not a multiplier in the space \({\mathcal{S}}^{{\prime}}\); hence the space H must also have the property that the characteristic function is a multiplier in H. Finally, the sampling map S and the low-pass filter map P must be well defined on the appropriate spaces, and their composition PS must be the identity map:

  • The characteristic functions are multipliers on the space of Fourier transforms of elements of H r(R), which can be identified with \({L}_{r}^{2}(\mathbf{R})\), the space of square-integrable functions with respect to the measure \(d\mu (x) = {(1 + \vert x{\vert }^{2})}^{r}\mathrm{d}x\). So we may consider H r with \(r < -1/2\); specifically we take H  − 1, since δ ∈ H  − 1.

  • We consider the sampling map Sf = f  ∗  as a map into H  − 1. Then the partial sums \( \sum\limits_{n=-N}^{N}f(n)\delta (t - n)\) of the sampled impulse train belong to H  − 1 and converge in the norm of H  − 1 to f  ∗ . This proposition does not require the signal to be bandlimited, but the next result does.

  • Next we consider the projection onto B π. We have B π ⊂ L 2 ⊂ H  − 1, and B π is closed in the topology of H  − 1. Hence we can define the orthogonal projection of H  − 1 onto B π. The reproducing kernel of B π enables us to compute this projection easily. In fact,

    $$P{\delta }_{a}(t) = k(t,a),\ t \in \mathbf{R}$$

    and

    $$P{f}^{{_\ast}} {=\lim }_{ N\rightarrow \infty } \sum\limits_{n=-N}^{N}f(n)k(\cdot,n)$$

    in the norm of H  − 1 [28]. The above series converges to Pf  ∗ (t) by the continuity of the orthogonal projector and to f(t) by the sampling theorem. Thus Pf  ∗  = f.

The above ideas provide a mathematical proof of the engineering arguments, but they also suggest important extensions to other RKHS, as discussed in [28].

7 Sampling in Reproducing Kernel Hilbert Spaces

At the outset, the expansions (4.2) and (4.3) mentioned in the introduction appear markedly different, or at least unrelated. The expansion (4.2) holds for any complete orthonormal sequence in a separable Hilbert space, and sampling points {t n } may have no meaning in this context. On the other hand, the sampling expansion does not require the sampling sequence \(\{{S}_{n}(t)\}_{n=1}^{\infty }\) to be orthonormal.

The expansions (4.2) and (4.3) can indeed be related. In finite-dimensional spaces, the two expansions correspond to different choices of orthonormal basis: one is orthonormal with respect to the (continuous) inner product, while the other rests on discrete orthogonality or biorthogonality of the sequences.

The expansions (4.2) and (4.3) may also be related in some special cases of orthonormal sequences in certain infinite-dimensional spaces. Let f be a signal defined on the real line. Suppose that there exist a reproducing kernel k(t, s) and real numbers \(\{{t}_{n}\}_{n=1}^{\infty }\) such that the sequence f n (s) = k(t n , s), n ≥ 1, is a complete orthonormal sequence. Then \({c}_{n} =\langle f,{f}_{n}\rangle = f({t}_{n})\). The series expansion \( \sum\nolimits_{n=1}^{\infty }{c}_{n}{f}_{n}\) then becomes a sampling expansion; i.e., it states how to recover f(t) from the sample values \(\{f{({t}_{n})\}}_{n=1}^{\infty }\). For example, for π-bandlimited signals, \(k(t,s) = \frac{\sin \pi (t-s)} {\pi (t-s)}\) and S n (t) = k(t, n), n ∈ Z, form an orthonormal basis for B π, and the expansion result mentioned above reduces to the Whittaker–Shannon–Kotel’nikov sampling expansion [42].

In [28], the authors introduced an approach that provides general sampling theorems for functions in a RKHS with reproducing kernel k(t, s) that is closed in the Sobolev space \({H}^{-p},p > 1/2\). The sampling functions \(\{{S}_{n} := k{({t}_{n},\cdot )\}}_{n=1}^{\infty }\) need not form an orthogonal system; instead, they must satisfy the biorthogonality condition S n (t m ) = δ mn for all m, n ≥ 1, and the sampling points must satisfy a density-type condition. This general setup includes sampling theorems associated with transforms other than the Fourier transform of the classical theory; for example, the Sturm–Liouville, Jacobi, and Laguerre transforms are among the examples discussed in [28], as well as sampling using frames. For the orthogonal case, several error analyses, such as truncation, aliasing, jitter, and amplitude errors, are discussed in detail.

Now we state a representative sampling theorem for signals in a reproducing kernel Hilbert space [28].

Theorem 2.

Let H ⊂ L 2(R) be a reproducing kernel Hilbert space that is closed in the Sobolev space H−1 and closed under differentiation. Assume that its reproducing kernel k(s,t) is continuous, that the zero sequence {t n } is a set of uniqueness for H, and that t n tends to infinity as n tends to infinity. If f ∈ H satisfies \(f(t)/k(t,t) = O({t}^{-2})\), then the sampled sequence

$${f}^{{_\ast}}(t) = \sum\limits_{n} \frac{f({t}_{n})} {k({t}_{n},{t}_{n})}\delta (t - {t}_{n})$$

converges in the sense of H−1, its orthogonal projection onto H equals f(t), and the series

$$f(t) = \sum\limits_{n} \frac{f({t}_{n})} {k({t}_{n},{t}_{n})}k({t}_{n},t)$$

converges uniformly on sets for which k(t,t) is bounded.

8 Sampling in Shift-Invariant Spaces

A shift-invariant space generated by a square-integrable function ϕ is given by

$${V }_{2}(\phi ) := \{ \sum\limits_{n\in \mathbf{Z}}{c}_{n}\phi (\cdot - n) :\ \sum\limits_{n\in \mathbf{Z}}\vert {c}_{n}{\vert }^{2} < \infty \}.$$

Shift-invariant spaces have been shown to be realistic models for signals with smooth spectra and are also suitable for taking real acquisition and reconstruction devices into account. The notion of shift-invariant spaces arises in approximation theory, wavelet theory, and sampling theory.

For the generator ϕ of a shift-invariant space, we usually assume that the family {ϕ( ⋅ − n) : n ∈ Z} of all integer shifts of ϕ is a Riesz basis for V 2(ϕ); i.e., there exist positive constants A and B such that

$$A \sum\limits_{n\in \mathbf{Z}}\vert {c}_{n}{\vert }^{2} \leq {\Big\| \sum\limits_{n\in \mathbf{Z}}{c}_{n}\phi (\cdot - n)\Big\|}_{2}^{2} \leq B \sum\limits_{n\in \mathbf{Z}}\vert {c}_{n}{\vert }^{2}.$$

The Paley–Wiener space B π is a shift-invariant space generated by the sinc function sinc(t), and the integer shifts of the sinc function form an orthonormal basis for the Paley–Wiener space B π. The following is a representative sampling theorem for signals in a shift-invariant space V 2(ϕ) established in [40].

Theorem 3.

Let ϕ be a real continuous function such that \({\sup }_{t\in \mathbf{R}}\vert \phi (t)\vert {(1 + \vert t\vert )}^{1+\epsilon } < \infty \) for some ε > 0, \({\hat{\phi }}^{{_\ast}}(\omega ) := \sum\limits_{n\in \mathbf{Z}}\phi (n){\mathrm{e}}^{-in\omega }\neq 0\) for all ω ∈ R, and {ϕ(t − n) :  n ∈Z} is an orthonormal basis for V 2 (ϕ). Then any signal f ∈ V 2 (ϕ) can be stably reconstructed from its samples {f(n)} n∈Z on the integer lattice. Moreover,

$$f(t) = \sum\limits_{n\in \mathbf{Z}}f(n)\tilde{\phi }(t - n)\quad \mathrm{for\ all}\ f \in {V }_{2}(\phi ),$$

where \(\tilde{\phi } \in {V }_{2}(\phi )\) is defined by \(\hat{\tilde{\phi }}(\omega ) =\hat{ \phi }(\omega )/\hat{{\phi }}^{{_\ast}}(\omega ),\omega \in \mathbf{R}\).

The reader may refer to [1, 4, 35, 37, 41] and the references therein for fundamental issues in sampling theory in shift-invariant spaces.
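The correction-filter idea behind Theorem 3 can be sketched numerically. The following is a minimal illustration, not the construction of [40] verbatim: the generator is assumed to be the centered cubic B-spline and the signal model is periodized (circular). Since ϕ(0) = 2∕3, ϕ( ± 1) = 1∕6, the symbol ϕ̂\*(ω) = 2∕3 + (1∕3)cos ω never vanishes, so dividing the DFT of the samples by it recovers the coefficients c n.

```python
import numpy as np

# A minimal numerical sketch of the correction-filter idea behind Theorem 3,
# using the centered cubic B-spline as generator and a periodized signal model
# (both illustrative assumptions, not the setting of [40] verbatim).

def bspline3(t):
    """Centered cubic B-spline: supported on [-2, 2], with
    bspline3(0) = 2/3, bspline3(+-1) = 1/6, and zero at other integers."""
    t = np.abs(t)
    return np.where(t < 1, 2/3 - t**2 + t**3/2,
           np.where(t < 2, (2 - t)**3 / 6, 0.0))

rng = np.random.default_rng(0)
N = 64                                   # one period of the circular model
c = rng.standard_normal(N)               # coefficients of f = sum_n c_n phi(.-n)

# Integer samples f(k) = sum_n c_n phi(k - n): a circular convolution of c
# with the three-tap sequence phi(-1), phi(0), phi(1).
h = np.zeros(N)
h[0], h[1], h[-1] = bspline3(0.0), bspline3(1.0), bspline3(-1.0)
samples = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(h)))

# The symbol phi^*(w) = sum_n phi(n) e^{-inw} = 2/3 + (1/3) cos w is strictly
# positive, so dividing by it in the DFT domain inverts the sampling map.
w = 2 * np.pi * np.arange(N) / N
c_rec = np.real(np.fft.ifft(np.fft.fft(samples) / (2/3 + np.cos(w) / 3)))
```

Here `c_rec` agrees with `c` to machine precision, mirroring the stable reconstruction asserted by the theorem.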

9 Sampling in Unitarily Translation-Invariant Hilbert Spaces

In this section, we consider sampling theorems on a unitarily translation-invariant RKHS generated from a single function. To be more specific, the RKHS H ϕ has the reproducing kernel

$${k}_{\phi }(t,s) ={ \int}_{-\infty }^{\infty }\phi (u - t)\phi (u - s)\mathrm{d}u$$
(4.10)

generated by a function ϕ ∈ L 1(R) ∩ L 2(R) whose Fourier transform has no real zeros. Examples of such generating functions ϕ include \({({\sigma }^{2} + {t}^{2})}^{-1}\), \({\mathrm{e}}^{-{\sigma }^{2}{t}^{2} }\), and e − σ | t | , where σ > 0. The RKHS H ϕ with reproducing kernel k ϕ in (4.10) is given by

$${H}_{\phi } =\{ f :\| {f\|}_{{H}_{\phi }} < \infty \},$$
(4.11)

where

$$\|{f\|}_{{H}_{\phi }} :={ \left ( \frac{1} {2\pi }{\int}_{\mathbf{R}}\vert \hat{f}(\omega ){\vert }^{2}/\vert \hat{\phi }(\omega ){\vert }^{2}\mathrm{d}\omega \right )}^{1/2}.$$

Theorem 4 ([38]). 

Let ϕ be an integrable function on the real line such that its Fourier transform \(\hat{\phi }\) has no real zeros and ∫ R |ϕ(t)| 2 (1 + t 2)α d t < ∞ for some α > 1, let k ϕ be the reproducing kernel in (4.10), and let \(\cdots < {t}_{-2} < {t}_{-1} < {t}_{0} = 0 < {t}_{1} < {t}_{2} < \cdots \) be sampling points with \({t}_{j+1} - {t}_{j} \geq \epsilon > 0\) for all j ∈ Z. Denote by \(\mathcal{X}\) the closed subspace of the reproducing kernel Hilbert space Hϕ in (4.11) spanned by kϕ(⋅,tj),j ∈Z. Then the sampling operator

$$\mathcal{X} \ni f\longmapsto {(f({t}_{j}))}_{j\in \mathbf{Z}}$$

is stable in the sense that there exist positive constants A and B such that

$$A\|{f\|}_{{H}_{\phi }} \leq \| {(f({t}_{j})){}_{j\in \mathbf{Z}}\|}_{2} \leq B\|{f\|}_{{H}_{\phi }}\quad \mathrm{for\ all}\ f \in \mathcal{X}.$$

Moreover, the sampling expansion

$$f(t) = \sum\limits_{j\in \mathbf{Z}}f({t}_{j})\frac{{k}_{\phi }(t,{t}_{j})} {{k}_{\phi }(0,0)}$$

is valid for all \(f \in \mathcal{X}\) .
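For the Gaussian generator ϕ(t) = e − σ²t² (one of the examples listed above), the kernel (4.10) has a closed form obtained by completing the square in the integrand; the following sketch checks that closed form against direct quadrature. The value σ = 1.3 and the quadrature grid are illustrative choices.

```python
import numpy as np

# Quadrature check of the reproducing kernel (4.10) for the Gaussian
# generator phi(t) = exp(-sigma^2 t^2).  Completing the square in the
# integrand gives the closed form
#     k_phi(t, s) = sqrt(pi / (2 sigma^2)) * exp(-sigma^2 (t - s)^2 / 2).

sigma = 1.3
phi = lambda t: np.exp(-sigma**2 * t**2)

def k_phi(t, s, n=20001, lim=12.0):
    """Approximate (4.10) by a Riemann sum on [-lim, lim]; the integrand
    decays so fast that truncation and discretization errors are tiny."""
    u = np.linspace(-lim, lim, n)
    return np.sum(phi(u - t) * phi(u - s)) * (u[1] - u[0])

def k_closed(t, s):
    return np.sqrt(np.pi / (2 * sigma**2)) * np.exp(-sigma**2 * (t - s)**2 / 2)
```

In this case the normalized weights in the sampling expansion of Theorem 4 simplify to k ϕ(t, t j)∕k ϕ(0, 0) = e − σ²(t − t j)²∕2.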

10 Sampling Signals with Finite Rate of Innovation

Signals with finite rate of innovation are signals that can be determined by finitely many samples per unit of time [39]. The concept of signals with finite rate of innovation was introduced and studied by Martin Vetterli and his school. Prototype examples of signals with finite rate of innovation include delta pulses, narrow pulses in ultrawideband communication, mass spectrometry data in medical diagnosis, and splines with uniform or nonuniform knots. They also include bandlimited signals and signals in shift-invariant spaces, which were discussed in the previous sections.

A common feature of signals with finite rate of innovation is that they have a parametric representation with a finite number of parameters per unit of time. So we may model a signal f with finite rate of innovation as a superposition of impulse responses with varying positions, amplitudes, and widths [34]; i.e.,

$$f(t) = \sum\limits_{\lambda \in \Lambda }{c}_{\lambda }{\phi }_{\lambda }(t - \lambda ),$$
(4.12)

where each λ ∈ Λ represents an innovative position of the signal, ϕλ is the impulse response of the signal-generating device at the innovative position λ, and c λ is the amplitude of the signal at the innovative position λ. Thus the function space

$${V }_{p}(\Phi ) := \{ \sum\limits_{\lambda \in \Lambda }c(\lambda ){\phi }_{\lambda }(\cdot - \lambda ) :\ {(c(\lambda ))}_{\lambda \in \Lambda } \in {\mathcal{l}}^{p}(\Lambda )\},1 \leq p \leq \infty,$$
(4.13)

could be suitable for modeling signals with finite rate of innovation.

Sampling theory for signals with finite rate of innovation has been demonstrated to be important for accurate time estimation in ultrawideband communication, registration of multiview images, pattern recognition, quantification of spectra, etc. The following is a sampling theorem for signals with finite rate of innovation when the innovative positions of the signal and the impulse responses of the signal-generating device at those positions are given [8, 33]:

Theorem 5.

Let Λ,Γ be relatively separated subsets of R, Φ ={ ϕλ : λ ∈ Λ} be a family of continuous functions on R such that \({\sup }_{t\in \mathbf{R},\lambda \in \Lambda }\vert {\phi }_{\lambda }(t)\vert {(1 + \vert t\vert )}^{\alpha } < \infty \) for some α > 1, and the space V 2 (Φ) be as in (4.13). Assume that Φ is a Riesz basis of V 2 (Φ) and that Γ is a stable ideal sampling set for V 2 (Φ); i.e., there exist positive constants A and B such that

$$A\|{f\|}_{2} \leq \| {(f(\gamma )){}_{\gamma \in \Gamma }\|}_{2} \leq B\|{f\|}_{2}\quad \mathrm{for\ all}\ f \in {V }_{2}(\Phi ).$$

Then there exists a displayer \(\tilde{\Psi } =\{\tilde{ {\psi }}_{\gamma } : \gamma \in \Gamma \}\) such that

$$\sup\limits_{t\in \mathbf{R},\gamma \in \Gamma }\vert \tilde{{\psi }}_{\gamma }(t)\vert {(1 + \vert t\vert )}^{\alpha } < \infty $$

and

$$f(t) = \sum\limits_{\gamma \in \Gamma }f(\gamma )\tilde{{\psi }}_{\gamma }(t - \gamma )\quad \mathrm{for\ all}\ f \in {V }_{2}(\Phi ).$$
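When the innovative positions Λ and the pulses ϕλ are known, as in Theorem 5, recovering the amplitudes from ideal samples is a linear problem. The sketch below illustrates this linear step only; the Gaussian pulses, the positions, and the uniform sampling set are illustrative assumptions, not taken from [8, 33].

```python
import numpy as np

# Sketch of the linear step behind Theorem 5: the ideal samples
#   f(gamma) = sum_lambda c_lambda phi_lambda(gamma - lambda)
# are linear in the amplitudes c_lambda, so with known positions and pulses
# the amplitudes are recovered by solving a linear system.

rng = np.random.default_rng(1)
positions = np.array([-3.0, -0.7, 1.2, 4.5])     # Lambda (assumed known)
widths    = np.array([0.5, 0.8, 0.4, 0.6])       # one pulse width per position
c_true    = rng.standard_normal(len(positions))  # amplitudes to recover

pulse = lambda t, w: np.exp(-(t / w)**2)         # illustrative Gaussian pulses

gammas = np.linspace(-8, 8, 33)                  # sampling set Gamma
A = pulse(gammas[:, None] - positions[None, :], widths[None, :])
samples = A @ c_true                             # noiseless ideal samples

c_rec, *_ = np.linalg.lstsq(A, samples, rcond=None)  # least-squares recovery
```

Since the positions are well separated relative to the pulse widths, the matrix `A` is well conditioned and the recovery is exact in the noiseless case.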

Now we consider the nonlinear and highly challenging problem of identifying the innovative positions and amplitudes of a signal with finite rate of innovation. For stable identification, the innovative positions should be separated from each other, and the amplitudes at the innovative positions should be above a certain level. So we may model such signals with finite rate of innovation as superpositions of impulse responses of active and inactive generating devices located at unknown positions near a uniform grid. In [36] we assume, after appropriate scaling, that signals live in a perturbed shift-invariant space

$${V }_{\infty,\oslash } := \left \{ \sum\limits_{n\in \mathbf{Z}}{c}_{n}\varphi (\cdot - n - {\sigma }_{n})\ :\ {({c}_{k})}_{k\in \mathbf{Z}} \in {\mathcal{l}}_{\oslash }^{\infty }(\mathbf{Z})\right \}$$

with unknown perturbation σ : = (σ k ) k ∈ Z , where

$${\mathcal{l}}_{\oslash }^{\infty }(\mathbf{Z}) = \left \{c := {({c}_{ k})}_{k\in \mathbf{Z}} :\ \| {c\|}_{{\mathcal{l}}_{\oslash }^{\infty }} :=\sup\limits_{{c}_{k}\neq 0}\left(\vert {c}_{k}\vert + \vert {c}_{k}{\vert }^{-1}\right) < \infty \right \}.$$

A negative result for sampling in the perturbed shift-invariant space V ∞, ⊘  is that not all signals in such a space can be recovered from their samples if φ satisfies the popular Strang–Fix condition. The reason is that one cannot determine the jitter σ0 of the signal \( \sum\limits_{k\in \mathbf{Z}}\varphi (\cdot - k - {\sigma }_{0}),{\sigma }_{0} \in \mathbf{R}\), as it has constant amplitudes and is the same function for all σ0 ∈ R. On the positive side, it is shown in [36] that any signal h in a perturbed shift-invariant space with unknown (but small) jitters can be recovered exactly from its average samples ⟨h, ψ m ( ⋅ − k)⟩, 1 ≤ m ≤ M, k ∈ Z, provided that the generator φ of the perturbed shift-invariant space and the average samplers ψ m , 1 ≤ m ≤ M, satisfy the following condition:

$$\mathrm{rank}\left (\begin{array}{*{10}c} \left[\widehat{\nabla \varphi },\widehat{{\psi }}_{1}\right](\xi )&\cdots &\left[\widehat{\nabla \varphi },\widehat{{\psi }}_{M}\right](\xi ) \\ \left[\hat{\varphi },\widehat{{\psi }}_{1}\right](\xi ) &\cdots & \left[\hat{\varphi },\widehat{{\psi }}_{M}\right](\xi ) \end{array} \right ) = 2\quad \mathrm{for\ all}\ \xi \in \left[-\pi,\pi \right].$$
(4.14)

Here the bracket product [f, g] of two square-integrable functions f and g is given by \([f,g](\xi ) = \sum\limits_{l\in \mathbf{Z}}f(\xi + 2l\pi )\overline{g(\xi + 2l\pi )}\).
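As a small numerical sanity check on the bracket product, one can truncate the sum over l. For the indicator f = χ [ − π, π] (the Fourier transform of the sinc function), exactly one translate ξ + 2lπ falls inside the support, so [f, f](ξ) = 1 on ( − π, π); the indicator and the truncation level below are illustrative.

```python
import numpy as np

# Truncated bracket product  [f, g](xi) = sum_l f(xi + 2 l pi) conj(g(xi + 2 l pi)).
# Sanity check: for f = chi_[-pi, pi] exactly one translate of xi lands in the
# support, so [f, f](xi) = 1 on (-pi, pi).

def bracket(f, g, xi, L=50):
    """Bracket product truncated to |l| <= L (illustrative truncation)."""
    shifts = xi + 2 * np.pi * np.arange(-L, L + 1)
    return np.sum(f(shifts) * np.conj(g(shifts)))

chi = lambda w: np.where(np.abs(w) < np.pi, 1.0, 0.0)   # indicator of (-pi, pi)
```
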

Theorem 6.

Let φ and ψ 1 ,…,ψ M satisfy (4.14) and have the following regularity and decay properties:

$$ \sup\limits_{t\in \mathbf{R}}\left (\vert \varphi (t)\vert + \vert \varphi ^{\prime}(t)\vert + \vert {\varphi }^{{\prime\prime}}(t)\vert + \sum\limits_{m=1}^{M}\vert {\psi }_{ m}(t)\vert \right ){(1 + \vert t\vert )}^{\alpha } < \infty $$
(4.15)

for some α > 1. Then for any L ≥ 1, there exists a positive number δ 1 ∈ (0,1∕2) such that any signal \(h(t) = \sum\limits_{k\in \mathbf{Z}}{c}_{k}\varphi (t - k - {\sigma }_{k})\) in the space V ∞,⊘ with \(\|{({c}_{k}){}_{k\in \mathbf{Z}}\|}_{{\mathcal{l}}_{\oslash }^{\infty }} \leq L\) and \(\|{({\sigma }_{k}){}_{k\in \mathbf{Z}}\|}_{\infty }\leq {\delta }_{1}\) can be reconstructed in a stable way from its average sample data ⟨h,ψ m (⋅− k)⟩, 1 ≤ m ≤ M, k ∈ Z.

11 Sampling in Reproducing Kernel Banach Subspaces of L p

Let 1 ≤ p ≤ ∞. A bounded linear operator T on L p(R) is said to be an idempotent operator if T 2 = T. Denote the range space of the idempotent operator T on L p(R) by V p ; i.e.,

$${V }_{p} := \{Tf :\ f \in {L}^{p}(\mathbf{R})\}.$$
(4.16)

The Paley–Wiener space, finitely generated shift-invariant spaces, p-integrable spline spaces, spaces modeling signals with finite rate of innovation, and L p itself are range spaces of idempotent operators.

Denote the Wiener amalgam space by

$${W}^{1} := \left \{f \in {L}^{1}(\mathbf{R}) :\ \| {f\|}_{{ W}^{1}} := \left \|\sup\limits_{-1/2\leq z<1/2}\vert f(\cdot + z)\vert \right \|{}_{1} < \infty \right \}$$

and the modulus of continuity of a kernel function K on R ×R by

$${\omega }_{\delta }(K)(s,t) :=\sup\limits_{-\delta \leq {z}_{1},{z}_{2}\leq \delta }\vert K(s + {z}_{1},t + {z}_{2}) - K(s,t)\vert.$$

A sampling set Γ in this chapter means a relatively separated discrete subset of R; i.e.,

$${B}_{\Gamma }(\delta ) :=\sup\limits_{t\in \mathbf{R}} \sum\limits_{\gamma \in \Gamma }{\chi }_{[-\delta,\delta ]}(t - \gamma ) < \infty $$
(4.17)

for some δ > 0, where χ E is the characteristic function on a set E. A sampling set Γ is said to have gap δ > 0 if

$${A}_{\Gamma }(\delta ) :=\inf\limits_{t\in \mathbf{R}} \sum\limits_{\gamma \in \Gamma }{\chi }_{[-\delta,\delta ]}(t - \gamma ) \geq 1$$
(4.18)

[1, 3, 4]. If we assume that the idempotent operator T is an integral operator

$$Tf(s) ={ \int}_{\mathbf{R}}K(s,t)f(t)\mathrm{d}t,\ \ f \in {L}^{p}(\mathbf{R}),$$
(4.19)

whose measurable kernel K has certain off-diagonal decay and regularity,

$$ \left \|\sup\limits_{z\in \mathbf{R}}\vert K(\cdot + z,z)\vert \right \|{}_{{W}^{1}} < \infty $$
(4.20)

and

$$ \lim\limits_{\delta \rightarrow 0}\left \|{\sup }_{z\in \mathbf{R}}\vert {\omega }_{\delta }(K)(\cdot + z,z)\vert \right \|{}_{{W}^{1}} = 0,$$
(4.21)

then,

  • V p is a reproducing kernel subspace of L p(R); i.e., for any t ∈ R there exists a positive constant C t such that

    $$\vert f(t)\vert \leq {C}_{t}\|{f\|}_{{L}^{p}(\mathbf{R})}\quad \ \mathrm{for\ all}\ f \in {V }_{p}.$$
    (4.22)
  • The kernel K satisfies the “reproducing kernel property”:

    $${\int}_{\mathbf{R}}K(s,z)K(z,t)\mathrm{d}z = K(s,t)\quad \ \mathrm{for\ all}\ s,t \in \mathbf{R}.$$
    (4.23)
  • K( ⋅, t) ∈ V p for any t ∈ R.

  • \({V }_{p} =\{ \sum\limits_{\lambda \in \Lambda }c(\lambda ){\phi }_{\lambda }(t - \lambda ) :\ {(c(\lambda ))}_{\lambda \in \Lambda } \in {\mathcal{l}}^{p}(\Lambda )\}\), where Λ is a relatively separated discrete subset of R and Φ = { ϕλ}λ ∈ Λ  ⊂ V p is localized in the sense that there exists a function h in the Wiener amalgam space W 1 such that ϕλ is dominated by h for every λ ∈ Λ, i.e.,

    $$\vert {\phi }_{\lambda }(t)\vert \leq h(t)\quad \mathrm{for\ all}\ \lambda \in \Lambda \ \mathrm{and}\ t \in \mathbf{R}.$$
    (4.24)
  • Signals in V p have finite rate of innovation.

  • For p = 2, an idempotent operator T whose kernel K satisfies the symmetry condition \(K(x,y) = \overline{K(y,x)}\) is an orthogonal projection of L 2 onto a closed subspace. In this case, the idempotent operator T and its kernel K are uniquely determined by the range space V 2.

The following sampling theorem in the reproducing kernel space V p is established in [25].

Theorem 7.

Let 1 ≤ p ≤∞, T be an idempotent integral operator whose kernel K satisfies (4.20) and (4.21), V p be the reproducing kernel subspace of Lp(R) associated with the operator T, and δ0 > 0 be so chosen that

$${r}_{0} := \left \|\sup\limits_{z\in \mathbf{R}}\vert {\omega }_{{\delta }_{0}/2}(K)(\cdot + z,z)\vert \right \|{}_{{L}^{1}(\mathbf{R})} < 1.$$
(4.25)

Then any signal f in V p can be reconstructed in a stable way from its samples f(γ),γ ∈ Γ, taken on a relatively separated subset Γ of R with gap δ0 .

A conclusion similar to the one in the above sampling theorem has been established in [12, Sect. 7.5] when the kernel K of the idempotent operator T satisfies the symmetry condition \(K(x,y) = \overline{K(y,x)}\).
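The counting quantities (4.17) and (4.18) that enter the hypotheses of Theorem 7 can be illustrated numerically for a finite sampling set; the uniform set and the grid of window centres below are illustrative choices.

```python
import numpy as np

# Counting quantities (4.17) and (4.18) for a finite sampling set:
# B_Gamma(delta) is the largest and A_Gamma(delta) the smallest number of
# sampling points in a window [t - delta, t + delta].  Gamma is relatively
# separated when B_Gamma(delta) < infinity, and has gap delta when
# A_Gamma(delta) >= 1.  For a finite set we scan window centres on a grid.

def window_counts(gamma, delta, t_grid):
    """Number of points of gamma in [t - delta, t + delta] for each centre t."""
    g = np.asarray(gamma)[None, :]
    t = np.asarray(t_grid)[:, None]
    return np.sum(np.abs(t - g) <= delta, axis=1)

gamma = np.arange(0.0, 20.0, 0.5)         # a uniform set with spacing 0.5
t_grid = np.linspace(2.0, 18.0, 2001)     # centres kept away from the edges

counts = window_counts(gamma, 0.5, t_grid)
B_gamma = counts.max()    # (4.17): at most 3 points in any window of radius 0.5
A_gamma = counts.min()    # (4.18): at least 1 point, so Gamma has gap 0.5
```
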

12 Convolution Sampling in Reproducing Kernel Banach Subspaces of L p

In this section, we consider convolution sampling for signals in certain reproducing kernel subspaces of L p, 1 ≤ p ≤ ∞. Here convolution sampling of a signal is ideal sampling of the convolved signal taken on a sampling set. Precisely, given an integrable convolutor ψ and a sampling set Γ, convolution sampling of a signal f consists of two steps: convolving ψ with the signal f,

$$\psi {_\ast} f(t) :={ \int}_{-\infty }^{\infty }f(s)\psi (t - s)\mathrm{d}s,$$

and then sampling the convolved signal ψ ∗ f on the sampling set Γ,

$$f\mathop{\longmapsto }\limits^{\mathrm{convoluting}}f {_\ast} \psi \mathop{\longmapsto }\limits^{\mathrm{sampling}}\{(f {_\ast} \psi ){(\gamma )\}}_{\gamma \in \Gamma }.$$

The data obtained by the above convolution sampling procedure are {(f ∗ ψ)(γ)}γ ∈ Γ . In [27], it is shown that any signal in the reproducing kernel subspace V p associated with an idempotent operator can be stably reconstructed from its convolution samples taken on a sampling set with small gap if and only if the convolution procedure is stable on that space.
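The two-step procedure can be sketched on a discretized line; the box convolutor and the cosine test signal below are illustrative choices, not tied to any particular space V p.

```python
import numpy as np

# A discrete sketch of the two-step convolution sampling procedure: convolve
# the signal with an integrable convolutor psi on a fine grid, then read the
# result off at the sampling set Gamma.

dt = 0.01
t = np.arange(-10, 10, dt)                 # fine grid standing in for R
f = np.cos(0.3 * t)                        # illustrative test signal

psi = lambda s: np.where(np.abs(s) <= 0.5, 1.0, 0.0)   # box convolutor

# Step 1: (psi * f)(t) = int f(s) psi(t - s) ds, approximated on the grid.
kernel = psi(np.linspace(-0.5, 0.5, 101))
conv = np.convolve(f, kernel, mode="same") * dt

# Step 2: sample the convolved signal on Gamma = {-5, -4, ..., 5}.
gammas = np.arange(-5.0, 6.0)
idx = np.round((gammas - t[0]) / dt).astype(int)
samples = conv[idx]

# For the box of width 1, (psi * f)(gamma) = (2/0.3) sin(0.15) cos(0.3 gamma).
exact = 2 * np.sin(0.15) / 0.3 * np.cos(0.3 * gammas)
```

The grid approximation matches the closed-form convolution up to the discretization error of the Riemann sum.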

Theorem 8.

Let 1 ≤ p ≤∞,ψ 1 ,…,ψ L be integrable functions on the real line, V p be the reproducing kernel subspace of L p in (4.16), and set Ψ = (ψ1,…,ψL)T. Assume that the kernel K of the idempotent operator T associated with the reproducing kernel space V p satisfies (4.20) and (4.21). Then the following two statements are equivalent:

  •  Ψ is a stable convolutor on V p ; i.e.,

    $$0 <\mathop{\inf}\limits_{g\in {V }_{p},\|{g\|}_{p}=1} \sum\limits_{l=1}^{L}\|{\psi }_{ l} {_\ast} {g\|}_{p} \leq \mathop{\sup}\limits_{g\in {V }_{p},\|{g\|}_{p}=1} \sum\limits_{l=1}^{L}\|{\psi }_{ l} {_\ast} {g\|}_{p} < \infty.$$
  •  Ψ is a stable convolution sampler on V p for all sampling sets having sufficiently small gap; i.e., there exists δ 0 > 0 such that

$$0 < \mathop{\inf}\limits_{0\neq f\in {V }_{p}}\frac{ \sum\limits_{l=1}^{L}\|\left({\psi }_{l} {_\ast} f(\gamma )\right)_{\gamma \in \Gamma }{\|}_{p}} {\|{f\|}_{p}} \leq \mathop{\sup}\limits_{0\neq f\in {V }_{p}}\frac{ \sum\limits_{l=1}^{L}\|\left({\psi }_{l} {_\ast} f(\gamma )\right)_{\gamma \in \Gamma }{\|}_{p}} {\|{f\|}_{p}} < \infty $$

    holds for any sampling set Γ satisfying 1 ≤ A Γ (δ) ≤ B Γ (δ) < ∞ for some δ ∈ (0,δ 0).

The equivalence in Theorem 8 was considered in [4] under the assumption that the reproducing kernel space V p is a finitely generated shift-invariant space.

13 Reproducing Kernel Hilbert Space Induced by Sampling Expansions

As indicated earlier, both the sampling map

$$f\longmapsto {(f({t}_{n}))}_{n=1}^{\infty }$$

and the inverse map

$${(f({t}_{n}))}_{n=1}^{\infty }\longmapsto f$$

need to be continuous in a setting where sampling expansions are to be used. Thus the evaluation functional E t f : = f(t) needs to be continuous for all t. Equivalently, the signal resides in an RKHS, even though this RKHS may not be explicitly identified. In [28], the authors have shown that, under very mild conditions, many versions of sampling theorems hold in an RKHS. In [29], the authors asked whether an RKHS exists for each sampling theorem and showed that the answer is affirmative when the sampling sequence satisfies minimal properties. The starting point is an abstract notion of a sampling expansion.

Definition: Let f be a function belonging to a class \(\mathcal{F}\) of continuous functions on Ω ⊂ R. A sampling theorem is associated with \(\mathcal{F}\) if there is a sequence of sampling pairs {(S n , t n )} of functions \({S}_{n} \in \mathcal{F}\) and points t n  ∈ Ω such that

  • S n (t k ) = δ nk , where δ nk is the Kronecker delta.

  • For each \(f \in \mathcal{F}\), the sequence {f(t n )} ∈ ℓ 2, i.e., ∑ n  | f(t n ) | 2 < ∞.

  • The set {t n } is a set of uniqueness for \(\mathcal{F}\).

  • For each {b n } ∈ ℓ 2 the series ∑ n b n S n (t) converges pointwise in Ω.
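A concrete family satisfying the first condition is the Paley–Wiener case S n(t) = sinc(t − n) with t n = n, where sinc(t) = sin(πt)∕(πt) (the normalized convention also used by `np.sinc`); the interpolation property S n(t k) = δ nk can be checked directly.

```python
import numpy as np

# The Paley-Wiener space provides a concrete family of sampling pairs in the
# sense of the definition above: S_n(t) = sinc(t - n) with t_n = n, where
# sinc(t) = sin(pi t)/(pi t) is np.sinc's normalized convention.

def S(n, t):
    return np.sinc(t - n)

ns = np.arange(-5, 6)                        # a finite window of indices
interp_matrix = S(ns[None, :], ns[:, None])  # entry (k, n) equals S_n(t_k)
```

The matrix `interp_matrix` is the identity, which is precisely the Kronecker-delta condition S n(t k) = δ nk on this finite window.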

The authors then construct an RKHS associated with the sampling expansion as follows [29]:

Theorem 9.

Let H 0 be the Hilbert space consisting of \(\mathcal{F}\) with the inner product \(\langle f,{g\rangle }_{0} = \sum\limits_{n}f({t}_{n})\overline{g({t}_{n})}\) . Then H 0 satisfies the following:

  • H 0 is a reproducing kernel Hilbert space with \(k(t,s) = \sum\limits_{n}\overline{{S}_{n}(t)}{S}_{n}(s)\) as its reproducing kernel.

  • {S n } is an orthonormal basis for H 0.

  • The sampling expansion g(t) = ∑ n g(t n )S n (t) holds for any g ∈ H 0.

14 Sampling in Reproducing Kernel Banach Spaces

A reproducing kernel Banach space is a Banach space B of functions on a set Ω such that the evaluation functional f ↦ f(t) is continuous for each t ∈ Ω [5]. The range space V p of an idempotent integral operator is a reproducing kernel Banach space when the kernel of the idempotent operator satisfies certain regularity conditions. In this section, we investigate sampling in a reproducing kernel Banach space.

Let 1 ≤ p ≤ ∞ and B be a Banach space with norm denoted by \(\|{\cdot \|}_{B}\). We say that a countable subset Λ of Ω is a p-sampling set for the Banach space B if

$$0 <\mathop{\inf}\limits_{f\in B,\|{f\|}_{B}=1}\|{(f(\lambda )){}_{\lambda \in \Lambda }\|}_{p} \leq \mathop{\sup}\limits_{f\in B,\|{f\|}_{B}=1}\|{(f(\lambda )){}_{\lambda \in \Lambda }\|}_{p} < \infty,$$
(4.26)

and that a countable collection of elements g λ, λ ∈ Λ, in the dual space of B is a p-frame if

$$0 <\mathop{\inf}\limits_{\|{f\|}_{B}=1}\|{({g}_{\lambda }(f)){}_{\lambda \in \Lambda }\|}_{p} \leq \mathop{\sup}\limits_{\|{f\|}_{B}=1}\|{({g}_{\lambda }(f)){}_{\lambda \in \Lambda }\|}_{p} < \infty,$$
(4.27)

i.e., the analysis operator T : B ∋ f↦(g λ(f))λ ∈ Λ  ∈ ℓ p is bounded from both above and below [2]. As with sampling in an RKHS, for a reproducing kernel Banach space B of functions on a set Ω, a countable subset Λ of Ω is a p-sampling set for the space B if and only if the corresponding evaluation functionals h λ : f ↦ f(λ), λ ∈ Λ, form a p-frame for the space B. Moreover, in [14] it is shown that a reconstruction formula always exists.

Theorem 10.

Let 1 ≤ p,q ≤∞ satisfy \(1/p + 1/q = 1\) , B be a reproducing kernel Banach space of functions on a set Ω, and Λ ⊂ Ω be a p-sampling set. Then there exists a collection of functions S λ (t),λ ∈ Λ, such that:

  • (S λ (t)) λ∈Λ is q-summable for every t ∈ Ω.

  • (η λ ) λ∈Λ is a p-frame for the range space of the sampling operator S : B ∋ f↦(f(λ)) λ∈Λ ∈ ℓ p , where η λ = (S λ′ (λ)) λ′∈Λ .

  • Every signal f in the reproducing kernel Banach space B has the following sampling expansion:

    $$f(t) = \sum\limits_{\lambda \in \Lambda }f(\lambda ){S}_{\lambda }(t),\ t \in \Omega,$$

    with pointwise convergence.

15 Average Sampling in L 2

In this section, we consider a very general sampling procedure where the samples are obtained by inner products between the time signal and sampling functionals. More precisely, given a time signal f living in a Hilbert space H, its average sample y γ at the location γ ∈ Γ is obtained by taking the inner product between the signal f and the sampling functional ψγ( ⋅ − γ) at the location γ; i.e., the sampling procedure on H via the average sampler \(\Psi = {({\psi }_{\gamma }(\cdot - \gamma ))}_{\gamma \in \Gamma }\) is a linear operator from H to ℓ 2(Γ):

$$S : H \ni f\longmapsto \{{y}_{\gamma } :=\langle f,{\psi }_{\gamma }{(\cdot - \gamma )\rangle \}}_{\gamma \in \Gamma } \in {\mathcal{l}}^{2}(\Gamma ).$$
(4.28)

We restrict ourselves to consider well-localized samplers \(\Psi = {({\psi }_{\gamma }(\cdot - \gamma ))}_{\gamma \in \Gamma }\), which means that Γ is a relatively separated subset of R and the sampling functionals ψγ are dominated by a function h in the Wiener amalgam space W 1; i.e., | ψγ(t) | ≤ h(t) for all t ∈ R and γ ∈ Γ. The reasons for considering well-localized samplers are twofold:

  • At each position γ ∈ Γ, we locate an acquisition device, and hence it is reasonable to assume that there are finitely many such acquisition devices in any unit interval, which in turn implies that Γ is relatively separated.

  • We use the sampling functional ψγ to reflect the characteristics of the acquisition device at the location γ, and hence the sampling functional ψγ should essentially be supported in a neighborhood of the sampling location γ, which can be described by domination by a function h with fast decay at infinity.
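Numerically, an average sample (4.28) is just a quadrature of the signal against the translated functional. In the sketch below the sampler is the same normalized Gaussian at every location, an illustrative well-localized choice; as its width shrinks, the average samples approach ideal point values.

```python
import numpy as np

# Sketch of the average sampling map (4.28): each sample is the inner product
# of the signal with a translated sampling functional, computed here by a
# Riemann sum.  The Gaussian sampler and the test signal are illustrative.

dt = 0.001
t = np.arange(-20, 20, dt)
f = np.exp(-t**2 / 8) * np.cos(t)            # a finite-energy test signal

def average_sample(gamma, eps):
    """<f, psi(. - gamma)> for a unit-mass Gaussian psi of width eps."""
    psi = np.exp(-((t - gamma) / eps)**2) / (eps * np.sqrt(np.pi))
    return np.sum(f * psi) * dt

gammas = np.arange(-3.0, 4.0)                # sampling locations
samples = np.array([average_sample(g, 0.05) for g in gammas])
ideal = np.exp(-gammas**2 / 8) * np.cos(gammas)   # ideal point values f(gamma)
```

With width 0.05 the average samples already agree with the ideal samples to about three decimal places, illustrating why narrow well-localized samplers behave like ideal sampling.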

It is well known that signals with finite energy do not have finite rate of innovation. In [26], we show that any signal f with finite energy can be determined by its samples ⟨f, ψγ( ⋅ − γ)⟩, γ ∈ Γ, for some well-localized sampler (ψγ( ⋅ − γ))γ ∈ Γ , but cannot be recovered in a stable way from the samples ⟨f, ψγ( ⋅ − γ)⟩, γ ∈ Γ, for any well-localized sampler (ψγ( ⋅ − γ))γ ∈ Γ .

Theorem 11.

  • There is a well-localized sampler (ψ γ (⋅− γ)) γ∈Γ such that any function f ∈ L 2 is uniquely determined by its samples ⟨f,ψ γ (⋅− γ)⟩,γ ∈ Γ.

  • There does not exist a well-localized sampler (ψ γ (⋅− γ)) γ∈Γ such that the sampling operator S in (4.28) is stable for H = L 2 in the sense that there exist positive constants A and B such that \(A\|{f\|}_{2}^{2} \leq\sum\limits_{\gamma \in \Gamma }\vert \langle f,{\psi }_{\gamma }(\cdot - \gamma )\rangle {\vert }^{2} \leq B\|{f\|}_{2}^{2}\) for all f ∈ L 2.

We remark that the functions ψγ, γ ∈ Γ, in the well-localized sampler in the first conclusion of Theorem 11 cannot be selected to be supported in a fixed compact set, but it is possible to let the elements ψγ, γ ∈ Γ, of the well-localized sampler be independent of γ ∈ Γ. This is closely related to the spectral problem of the density of the collection of exponentials {exp(iγt)}γ ∈ Γ in a weighted L 2 space [31]. In [26], we conjecture that there is no determining sampler {ψγ( ⋅ − γ) |  γ ∈ Γ} such that \(\|{\psi {}_{\gamma }\|}_{2} = 1\) and | ψγ(x) | ≤ Cexp( − ε | x | ) for some positive constants C, ε and a relatively separated subset Γ of R.

Dedication. This chapter is dedicated to Professor Gilbert Walter on the occasion of his 80th birthday:

  • In appreciation of his friendship and important contributions to Mathematical Analysis and Applications.

  • With admiration of the novel and clever ways in which he has brought together ideas from classical and modern analysis to advance our understanding of generalized functions, wavelets, and signal processing.