1 Introduction

In ocean data assimilation (or analysis), the coordinates (x, y, z) are usually represented by the position vector r, with grid points denoted by r n , n = 1, 2, …, N, and observational locations by r (m), m = 1, 2, …, M. Here, N is the total number of grid points, and M is the total number of observational points. A single variable or multiple variables c = (u, v, T, S, …), whether two- or three-dimensional, can be ordered by grid point and by variable to form a single vector of length NP, with P the number of variables. For multiple variables, non-dimensionalization is conducted before forming the single vector c (Chu et al. 2015), with the “true”, analysis, and background fields (c t , c a , c b ) and the observational data (c o ) represented by N- and M-dimensional vectors,

$$ {\mathbf{c}}_{t,a,b}^T=\left[{c}_{t,a,b}\left({\mathbf{r}}_1\right),{c}_{t,a,b}\left({\mathbf{r}}_2\right),\dots, {c}_{t,a,b}\left({\mathbf{r}}_N\right)\right],\kern0.75em {\mathbf{c}}_o^T=\left[{c}_o\left({\mathbf{r}}^{(1)}\right),{c}_o\left({\mathbf{r}}^{(2)}\right),\dots, {c}_o\left({\mathbf{r}}^{(M)}\right)\right], $$
(1)

where the superscript ‘T’ denotes transpose. The innovation (also called the observational increment),

$$ \mathbf{d}\equiv \left({\mathbf{c}}_o-\mathbf{H}{\mathbf{c}}_b\right), $$
(2)

represents the difference between the observational and background data at the observational points r (m). Here, H = [h mn ] is an M × N linear observation operator matrix converting the background field c b (at the grid points, r n ) into “first guess observations” at the observational points r (m) (Fig. 1).

Fig. 1
figure 1

Illustration of ocean data assimilation with c b located at the grid points and c o located at the observational points (asterisks). Ocean data assimilation converts the innovation, d = c o  − H c b , from the observational points to the grid points

The analysis error (ε a ) and observational error (ε o ) are defined by

$$ {\boldsymbol{\upvarepsilon}}_a={\mathbf{c}}_a-{\mathbf{c}}_t,\kern0.75em {\boldsymbol{\upvarepsilon}}_o\equiv {\mathbf{H}}^T{\mathbf{c}}_o-{\mathbf{c}}_t, $$
(3a)

which are evaluated at the grid points. The two errors are usually assumed to be independent of each other,

$$ \left\langle {\boldsymbol{\upvarepsilon}}_o^T{\boldsymbol{\upvarepsilon}}_a\right\rangle =0,\kern1em \left\langle \right\rangle \equiv \frac{1}{N-1}{\displaystyle \sum_{n=1}^N\left[\right]}. $$
(3b)

Minimization of the analysis error variance

$$ {E}^2=\left\langle {\boldsymbol{\upvarepsilon}}_a^T{\boldsymbol{\upvarepsilon}}_a\right\rangle \to \min $$
(4)

gives the optimal analysis field c a for the “true” field c t .

A common practice in ocean data assimilation (or analysis) is to use an N × M weight matrix W = [w nm ] to blend c b (at the grid points r n ) with the innovation d (at the observational points r (m)) (Evensen 2003; Tang and Kleeman 2004; Chu et al. 2004a, 2015; Galanis et al. 2006; Oke et al. 2008; Han et al. 2013; Yan et al. 2015)

$$ {\mathbf{c}}_a={\mathbf{c}}_b+\mathbf{W}\mathbf{d}. $$
(5)

Minimization of the analysis error variance with respect to weights,

$$ \partial {E}^2/\partial {w}_{nm}=0, $$
(6)

determines the weight matrix

$$ \mathbf{W}=\mathbf{B}{\mathbf{H}}^T{\left(\mathbf{H}\mathbf{B}{\mathbf{H}}^T+\mathbf{R}\right)}^{-1}. $$
(7)

Here, B is the N × N background error covariance matrix; R is the M × M observational error covariance matrix and is usually simplified as a product of an observational error variance (\( {e}_o^2 \)) and an identity matrix I,

$$ \mathbf{R}={e}_o^2\mathbf{I}. $$
(8)

Substitution of (7) into (5) leads to the optimal interpolation (OI) equation,

$$ {\mathbf{c}}_a={\mathbf{c}}_b+\mathbf{B}{\mathbf{H}}^T{\left(\mathbf{H}\mathbf{B}{\mathbf{H}}^T+\mathbf{R}\right)}^{-1}\mathbf{d}, $$
(9)

which produces the analysis field c a from the innovation d. The challenge for the OI method is the determination of the background error covariance matrix B.
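
For readers who prefer code to matrix algebra, a minimal NumPy sketch of the OI update (9) is given below; it assumes c b , c o , H, B, and R are already available as arrays, and the function name is illustrative.

```python
import numpy as np

def oi_update(c_b, c_o, H, B, R):
    """Optimal interpolation update, Eq. (9)."""
    d = c_o - H @ c_b                               # innovation, Eq. (2)
    S = H @ B @ H.T + R                             # M x M matrix  H B H^T + R
    return c_b + B @ (H.T @ np.linalg.solve(S, d))  # analysis c_a, Eqs. (5) and (7)
```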

An alternative approach is to use a spectral method with lateral boundary (Γ) information to decompose the variable anomaly at the grid points [c(r n ) − c b (r n )] into (Chu et al. 2015),

$$ {c}_a\left({\mathbf{r}}_n\right)-{c}_b\left({\mathbf{r}}_n\right)={s}_K\left({\mathbf{r}}_n\right),\kern0.75em {s}_K\left({\mathbf{r}}_n\right)\equiv {\displaystyle \sum_{k=1}^K{a}_k\ }{\phi}_k\left({\mathbf{r}}_n\right), $$
(10)

where {ϕ k } are basis functions and K is the mode truncation. The eigenvectors of the Laplace operator with the same lateral boundary condition as (c − c b ) can be used as the set of basis functions {ϕ k } and written in matrix form (Chu et al. 2015)

$$ \boldsymbol{\Phi} =\left\{{\phi}_{kn}\right\}=\left[\begin{array}{cccc}\hfill {\phi}_1\left({\mathbf{r}}_1\right)\hfill & \hfill {\phi}_2\left({\mathbf{r}}_1\right)\hfill & \hfill \dots \hfill & \hfill {\phi}_K\left({\mathbf{r}}_1\right)\hfill \\ {}\hfill {\phi}_1\left({\mathbf{r}}_2\right)\hfill & \hfill {\phi}_2\left({\mathbf{r}}_2\right)\hfill & \hfill \dots \hfill & \hfill {\phi}_K\left({\mathbf{r}}_2\right)\hfill \\ {}\hfill \dots \hfill & \hfill \dots \hfill & \hfill \dots \hfill & \hfill \dots \hfill \\ {}\hfill {\phi}_1\left({\mathbf{r}}_N\right)\hfill & \hfill {\phi}_2\left({\mathbf{r}}_N\right)\hfill & \hfill \dots \hfill & \hfill {\phi}_K\left({\mathbf{r}}_N\right)\hfill \end{array}\right]. $$
(11)

For a given mode truncation K, minimization of the analysis error variance (4) with respect to the spectral coefficients

$$ \partial {E}_K^2/\partial {a}_k=0,\kern0.75em k=1,\ldots,K $$
(12)

gives the spectral ocean data assimilation equation (Chu et al. 2004b, 2015),

$$ {\mathbf{c}}_a={\mathbf{c}}_b+\mathbf{F}{\boldsymbol{\Phi}}^T{\left[\boldsymbol{\Phi} \mathbf{F}{\boldsymbol{\Phi}}^T\right]}^{-1}\boldsymbol{\Phi} {\mathbf{H}}^T\mathbf{d}, $$
(13)

where F is an N × N (diagonal) observational contribution matrix

$$ \mathbf{F}=\left[\begin{array}{cccccc}\hfill {f}_1\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill {f}_2\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill \ddots \hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill {f}_n\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill \ddots \hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill {f}_N\hfill \end{array}\right],\kern1.25em {f}_n\equiv {\displaystyle \sum_{m=1}^M{h}_{nm}}. $$
(14)

Here, the matrices Φ, F, and H are all known, in contrast to the OI Eq. (9), where the background error covariance matrix B needs to be determined.
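
A corresponding sketch of the OSD analysis Eq. (13) follows. Note that in Eq. (13) Φ enters as a K × N matrix, so the array Phi below, stored with the (N, K) layout displayed in Eq. (11), appears through its transpose; the routine is a direct transcription of Eq. (13), and all names are illustrative.

```python
import numpy as np

def osd_update(c_b, c_o, H, Phi):
    """Spectral (OSD) analysis, Eq. (13).

    Phi : (N, K) array whose columns are the retained basis functions, as laid out in Eq. (11).
    H   : (M, N) observation operator.
    """
    d = c_o - H @ c_b                   # innovation, Eq. (2)
    f = H.sum(axis=0)                   # f_n: total operator weight attached to grid point n, Eq. (14)
    F = np.diag(f)                      # diagonal observational contribution matrix F
    rhs = Phi.T @ (H.T @ d)             # Phi H^T d in Eq. (13)
    A = Phi.T @ F @ Phi                 # Phi F Phi^T in Eq. (13)
    a = np.linalg.solve(A, rhs)         # spectral coefficients a_k
    return c_b + F @ (Phi @ a)          # c_a = c_b + F Phi^T [Phi F Phi^T]^{-1} Phi H^T d
```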

This spectral method has been proven effective for ocean data analysis. Chu et al. (2003a, b) named the spectral method the optimal spectral decomposition (OSD). With it, several new ocean phenomena have been identified from observational data, such as a bi-modal structure of chlorophyll-a with winter/spring (February–March) and fall (September–October) blooms in the Black Sea (Chu et al. 2005a), a fall–winter recurrence of the current reversal from westward to eastward on the Texas–Louisiana continental shelf from current-meter and near-surface drifting buoy data (Chu et al. 2005b), the propagation of long Rossby waves at mid-depths (around 1000 m) in the tropical North Atlantic from Argo float data (Chu et al. 2007), and the temporal and spatial variability of the global upper ocean heat content (Chu 2011) from the data of the Global Temperature and Salinity Profile Program (GTSPP, Sun et al. 2009).

The spectral mode truncation is key to the success of the OSD method. It acts as a spatial low-pass filter, allowing the highest wave numbers (those corresponding to the highest spectral eigenvalues) that can be retained without aliasing, given the information provided by the observational network.

Questions arise: Can a simple and effective mode truncation method be developed that takes into account the model resolution (i.e., the total number of model grid points)? What are the major differences between OI and OSD? What are the quality and uncertainty of the OSD method? The purpose of this paper is to answer these questions. The remainder of the paper is organized as follows. Section 2 describes the error analysis. Section 3 presents the steep-descending mode truncation method. Section 4 shows the idealized “truth” and “observational” fields. Section 5 compares the analysis fields between OSD and OI. Section 6 introduces three synoptic monthly gridded world ocean temperature, salinity, and absolute geostrophic velocity datasets produced with the OSD method and quality controlled by the NOAA National Centers for Environmental Information (NCEI). Conclusions are given in Section 7. Appendices A and B briefly describe several methods to determine the H matrix. Appendix C shows the determination of the basis functions. Appendix D presents the Vapnik-Chervonenkis dimension for mode truncation. Appendix E depicts the special B matrix used in this study.

2 Error analysis

Low mode truncation does not represent reality well, while high mode truncation may contain too much noise. Let the truncated spectral representation s K in (10) at the grid points form an N-dimensional vector,

$$ {\mathbf{s}}_K^T=\left[{s}_K\left({\mathbf{r}}_1\right),{s}_K\left({\mathbf{r}}_2\right),\dots, {s}_K\left({\mathbf{r}}_N\right)\right]. $$
(15)

The M-dimensional innovation vector [see (2)]

$$ {\mathbf{d}}^T=\left[d\left({\mathbf{r}}^{(1)}\right),d\left({\mathbf{r}}^{(2)}\right),\dots, d\left({\mathbf{r}}^{(M)}\right)\right] $$

at observational points can be transformed into the grid points

$$ {D}_n\equiv D\left({\mathbf{r}}_n\right)=\frac{{\displaystyle \sum_{m=1}^M{h}_{nm}{d}^{(m)}}}{f_n},\kern0.75em {f}_n\equiv {\displaystyle \sum_{m=1}^M{h}_{nm}}, $$
(16)

where D(r n) represents the observational innovation at the grid points,

$$ D\left({\mathbf{r}}_n\right)={c}_o\left({\mathbf{r}}_n\right)-{c}_b\left({\mathbf{r}}_n\right). $$
(17)

From Eq. (3a), the observations at the grid points are computed as c o (r n ) = [H T c o ] n , while the original background state c b (r n ) remains in grid space. The matrix form of (16) is

$$ \mathbf{F}\mathbf{D}={\mathbf{H}}^T\mathbf{d}, $$
(18)

where f n denotes the contribution of all observational data to the grid point r n . The larger the value of f n , the larger the observational influence on that grid point (r n ). D is an N-dimensional vector at the grid points,

$$ {\mathbf{D}}^T=\left({D}_1,{D}_2,\dots, {D}_N\right) $$
(19)
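
A short sketch of the transformation (16)-(18), assuming H is stored as an (M, N) NumPy array; leaving D = 0 at grid points with f n  = 0 (no observational influence) is an illustrative choice not specified in the text.

```python
import numpy as np

def innovation_to_grid(d, H):
    """Map the innovation d (length M) to the grid points, Eqs. (16)-(18)."""
    f = H.sum(axis=0)                              # f_n: observational contribution per grid point
    D = np.zeros(H.shape[1])                       # N-dimensional vector D, Eq. (19)
    touched = f > 0                                # grid points with non-zero observational influence
    D[touched] = (H.T @ d)[touched] / f[touched]   # D_n, Eq. (16); equivalently F D = H^T d, Eq. (18)
    return D, f
```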

The analysis error (i.e., the analysis c a versus the “truth” c t ) in the spectral data assimilation [see (10)] is given by

$$ \begin{array}{l}{\varepsilon}_a\left({\mathbf{r}}_n\right)\equiv {c}_a\left({\mathbf{r}}_n\right)-{c}_t\left({\mathbf{r}}_n\right)\\ {}=\left[{c}_a\left({\mathbf{r}}_n\right)-{c}_b\left({\mathbf{r}}_n\right)\right]-\left[{c}_o\left({\mathbf{r}}_n\right)-{c}_b\left({\mathbf{r}}_n\right)\right]+\left[{c}_o\left({\mathbf{r}}_n\right)-{c}_t\left({\mathbf{r}}_n\right)\right]\\ {}={s}_K\left({\mathbf{r}}_n\right)-D\left({\mathbf{r}}_n\right)+{\varepsilon}_o\left({\mathbf{r}}_n\right)\end{array} $$
(20)

Here, (10) and (17) are used. The analysis error is decomposed into two parts

$$ {\varepsilon}_a\left({\mathbf{r}}_n\right)={\varepsilon}_K\left({\mathbf{r}}_n\right)+{\varepsilon}_o\left({\mathbf{r}}_n\right), $$
(21)

with the truncation error given by

$$ {\varepsilon}_K\left({\mathbf{r}}_n\right)={s}_K\left({\mathbf{r}}_n\right)-D\left({\mathbf{r}}_n\right), $$
(22a)

and the observational error given by

$$ {\varepsilon}_o\left({\mathbf{r}}_n\right)={c}_o\left({\mathbf{r}}_n\right)-{c}_t\left({\mathbf{r}}_n\right). $$
(22b)

3 Steep-descending mode truncation

The Vapnik-Chervonenkis dimension (Vapnik 1983; Chu et al. 2003a, 2015) was used to determine the optimal mode truncation K OPT. As depicted in Appendix D, it depends only on the ratio of the total number of observational points (M) to the spectral truncation (K) and does not depend on the total number of model grid points (N). This method neglects the observational error and ignores the model resolution. In fact, the analysis error variance over the whole domain is given by

$$ {E}_a^2\equiv \left\langle \left[{\boldsymbol{\upvarepsilon}}_a^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_a\right]\right\rangle =\left\langle \left[{\boldsymbol{\upvarepsilon}}_K^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_K\right]\right\rangle +2\left\langle \left[{\boldsymbol{\upvarepsilon}}_K^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right]\right\rangle +\left\langle \left[{\boldsymbol{\upvarepsilon}}_o^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right]\right\rangle, \kern0.5em \left\langle \left[{\boldsymbol{\upvarepsilon}}_o^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right]\right\rangle =\frac{M}{N}{e}_o^2, $$
(23)

where \( {e}_o^2 \) is the observational error variance [see (8)]. Here, the observational error is assumed to be the same at the grid points as at the observational points, a consequence of the simplification of the error covariance matrix R = \( {e}_o^2 \) I. The Cauchy-Schwarz inequality shows that

$$ \begin{array}{l}{E}_a^2\le \left\langle \left[{\boldsymbol{\upvarepsilon}}_K^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_K\right]\right\rangle +2\sqrt{\left\langle \left[{\boldsymbol{\upvarepsilon}}_K^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_K\right]\right\rangle}\sqrt{\left\langle \left[{\boldsymbol{\upvarepsilon}}_o^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right]\right\rangle }+\left\langle \left[{\boldsymbol{\upvarepsilon}}_o^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right]\right\rangle \\ {}={E}_K^2+2{E}_K\sqrt{M/N}\,{e}_o+\left(M/N\right){e}_o^2.\end{array} $$
(24)

The relative analysis error reduction at the mode-K can be expressed by the ratio

$$ {\gamma}_K=\mathit{\ln}\left[\frac{E_{K-1}^2+2{E}_{K-1}\sqrt{M/N}\,{e}_o+M{e}_o^2/N}{E_K^2+2{E}_K\sqrt{M/N}\,{e}_o+M{e}_o^2/N}\right],\kern1.25em K=2,3,\dots $$
(25)

Both E K and E K−1 are large for small K (low-mode truncation), which may lead to a small value of γ K . Both E K and E K−1 are small for large K (high-mode truncation), which also leads to a small value of γ K . An optimal truncation should lie between the low-mode and high-mode truncations, with a larger value of γ K (above a threshold). This procedure is illustrated as follows. The values (γ 2 , γ 3 , …, γ KB ) are calculated using (25) up to a large K B (say 250). The mean and standard deviation of γ can be computed as,

$$ \overline{\gamma}=\frac{1}{K_B-1}{\displaystyle \sum_{K=2}^{K_B}{\gamma}_K},s=\sqrt{\frac{1}{K_B-2}{\displaystyle \sum_{K=2}^{K_B}{\left({\gamma}_K-\overline{\gamma}\right)}^2}}. $$
(26)

Suppose that the relative error reductions (γ 2 , γ 3 , …, γ KB ) follow a Gaussian distribution. A 100(1 − α) % upper one-sided confidence bound on γ is given by

$$ {\gamma}_{th}=\overline{\gamma}+{z}_{\alpha }s, $$
(27)

which is used as the threshold for the mode truncation. Here, z α is the upper α quantile of the standard Gaussian distribution (zero mean and unit standard deviation). If several γ values exceed the threshold, the highest mode

$$ {K}_{\mathrm{OPT}}=\underset{\gamma_K\ge {\gamma}_{th}}{\mathit{\max}}(K) $$
(28)

is selected for mode truncation. After the mode truncation K OPT is determined, the spectral coefficients (a k , k = 1, 2, …, K OPT) can be calculated, and so can the truncation error variance \( {E}_{K_{\mathrm{OPT}}}^2 \).
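
The steep-descending procedure (25)-(28) can be summarized in a few lines of code. The sketch below assumes the truncation errors E K for K = 1, …, K B have already been computed (e.g., from least-squares fits of the first K modes to the gridded innovation); the function name, the default z quantile, and the fallback when no γ K exceeds the threshold are illustrative.

```python
import numpy as np

def steep_descending_truncation(E, M, N, e_o, z_alpha=1.645):
    """Steep-descending mode truncation, Eqs. (25)-(28).

    E : array of truncation errors E_K for K = 1, ..., K_B.
    Returns the optimal truncation K_OPT.
    """
    obs_term = (M / N) * e_o**2                       # observational term, Eq. (23)
    cross = 2.0 * np.sqrt(M / N) * e_o                # cross-term coefficient, Eq. (24)
    bound = E**2 + cross * E + obs_term               # upper bound on E_a^2, Eq. (24)
    gamma = np.log(bound[:-1] / bound[1:])            # gamma_K for K = 2, ..., K_B, Eq. (25)
    gamma_th = gamma.mean() + z_alpha * gamma.std(ddof=1)   # threshold, Eqs. (26)-(27)
    exceed = np.nonzero(gamma >= gamma_th)[0]         # modes whose gamma_K exceeds the threshold
    return int(exceed[-1]) + 2 if exceed.size else 1  # highest such K, Eq. (28); fallback K = 1
```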

3.1 Multi-platform observations

Let observations be conducted by L instruments with different observational errors \( {e}_o^{(l)} \), deployed at \( {\mathbf{r}}_l^{\left({m}_l\right)} \) (m l  = 1, 2, …, M l ; l = 1, 2, …, L). The total number of observations is \( M={\displaystyle \sum_{l=1}^L{M}_l} \). The M-dimensional observational vector is represented by

$$ {\mathbf{c}}_o^T=\left[\begin{array}{l}{c}_o\left({\mathbf{r}}_1^{(1)}\right),{c}_o\left({\mathbf{r}}_1^{(2)}\right),\dots, {c}_o\left({\mathbf{r}}_1^{\left({M}_1\right)}\right),{c}_o\left({\mathbf{r}}_2^{(1)}\right),{c}_o\left({\mathbf{r}}_2^{(2)}\right),\dots, {c}_o\left({\mathbf{r}}_2^{\left({M}_2\right)}\right),\dots, \\ {}{c}_o\left({\mathbf{r}}_L^{(1)}\right),{c}_o\left({\mathbf{r}}_L^{(2)}\right),\dots, {c}_o\left({\mathbf{r}}_L^{\left({M}_L\right)}\right)\end{array}\right] $$
(29)

The observational error variance is given by

$$ \left\langle {\boldsymbol{\upvarepsilon}}_o^T\mathbf{F}{\boldsymbol{\upvarepsilon}}_o\right\rangle =\left[{M}_1{\left({e}_o^{(1)}\right)}^2+{M}_2{\left({e}_o^{(2)}\right)}^2+\dots +{M}_L{\left({e}_o^{(L)}\right)}^2\right]/N. $$
(30)

The relative error reduction γ K for mode truncation (25) is replaced by

$$ {\gamma}_K=\mathit{\ln}\left[\frac{E_{K-1}^2+2{E}_{K-1}{\displaystyle \sum_{l=1}^L\sqrt{M_l/N}{e}_o^{(l)}}+{\displaystyle \sum_{l=1}^L{M}_l{\left({e}_o^{(l)}\right)}^2}/N}{E_K^2+2{E}_K{\displaystyle \sum_{l=1}^L\sqrt{M_l/N}{e}_o^{(l)}}+{\displaystyle \sum_{l=1}^L{M}_l{\left({e}_o^{(l)}\right)}^2/N}}\right],\kern1.25em K=2,3,\dots $$
(31)

After the mode truncation is determined, the OSD Eq. (13) is used to get the analysis field.
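
For multi-platform observations, only the observational terms in the bound (24) change; below is a sketch of the modified ratio (31), assuming per-instrument arrays of observation counts M l and error standard deviations e o (l) (names illustrative).

```python
import numpy as np

def gamma_multi(E, M_l, e_o_l, N):
    """Relative error reduction gamma_K for L instrument types, Eq. (31)."""
    M_l = np.asarray(M_l, dtype=float)
    e_o_l = np.asarray(e_o_l, dtype=float)
    cross = 2.0 * np.sum(np.sqrt(M_l / N) * e_o_l)   # cross-term coefficient in Eq. (31)
    obs_term = np.sum(M_l * e_o_l**2) / N            # observational term in Eq. (31)
    bound = E**2 + cross * E + obs_term
    return np.log(bound[:-1] / bound[1:])            # gamma_K, K = 2, ..., K_B
```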

4 “Truth,” “background,” and “observational” fields

Consider an artificial non-dimensional horizontal domain (−19 < x < 19, −15 < y < 15) with the four curved rigid boundaries (Fig. 2):

$$ \begin{array}{l}\frac{x}{10}-0.3 \cos \left(\frac{y}{8}\right) \sin \left(\frac{x}{10}\right)=\xi =\left\{\begin{array}{c}\hfill -\pi /2\kern1.25em \left(\mathrm{west}\right)\hfill \\ {}\hfill \kern0.75em \pi /2\kern1.5em \left(\mathrm{east}\right)\kern2.5em \hfill \end{array}\right.\\ {}\frac{y}{8}-0.2 \sin \left(\frac{x}{5}\right)\left[1- \cos \left(\frac{y}{8}\right)\right]=\eta =\left\{\begin{array}{c}\hfill -\pi /2\kern1.25em \left(\mathrm{south}\right)\hfill \\ {}\hfill \kern0.75em \pi /2\kern1.5em \left(\mathrm{north}\right)\kern2.5em \hfill \end{array}\right.\end{array} $$
(32)
Fig. 2
figure 2

Horizontal non-dimensional domain with four curved rigid boundaries with each boundary given by Eq. (32)

The domain is discretized with Δx = Δy = 0.5. The total number of grid points inside the domain is N = 3569. Figure 3 shows the first 12 basis functions {ϕ k }, which are the eigenvectors of the Laplacian operator with the Dirichlet boundary condition, i.e., b 1 = 0 in (61) of Appendix C.

Fig. 3
figure 3

Basis functions from ϕ1 to ϕ12 for the domain depicted by Eq. (32)

The first basis function ϕ 1 (r n ) shows a one-gyre structure. The second and third basis functions ϕ 2 (r n ) and ϕ 3 (r n ) show east-west and north-south dual-eddy structures. The fourth basis function ϕ 4 (r n ) shows an east-west slanted dipole pattern with opposite signs in the northeastern region (positive) and the southwestern region (negative). The fifth basis function ϕ 5 (r n ) shows a tripole pattern with negative values in the western and eastern regions and positive values in between. The higher order basis functions have more complicated variability structures.
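
A minimal sketch of how such basis functions can be computed numerically: build the 5-point Laplacian on the grid points inside the domain with the homogeneous Dirichlet condition (b 1 = 0) and take its leading eigenvectors. The mask describing the curved domain (32) is assumed to be given, and a dense eigensolver is used for brevity; for the N = 3569 grid points of this study a sparse solver would normally be preferred.

```python
import numpy as np

def laplacian_eigenbasis(mask, h, K):
    """Eigenvectors of the 5-point Laplacian with a Dirichlet condition on a masked grid.

    mask : 2-D boolean array, True for grid points inside the domain.
    h    : grid spacing (here dx = dy = 0.5).
    K    : number of basis functions (modes) to return.
    """
    idx = -np.ones(mask.shape, dtype=int)
    idx[mask] = np.arange(mask.sum())                 # number the interior points
    N = mask.sum()
    L = np.zeros((N, N))
    for (i, j), n in np.ndenumerate(idx):
        if n < 0:
            continue
        L[n, n] = -4.0 / h**2                         # center point of the 5-point stencil
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < mask.shape[0] and 0 <= jj < mask.shape[1] and idx[ii, jj] >= 0:
                L[n, idx[ii, jj]] = 1.0 / h**2        # neighbor inside the domain; outside = 0 (Dirichlet)
    vals, vecs = np.linalg.eigh(-L)                   # -L is symmetric; eigenvalues returned ascending
    return vecs[:, :K]                                # columns are phi_1, ..., phi_K as in Eq. (11)
```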

Two “truth” fields for the non-dimensional domain with four rigid and curved boundaries (Fig. 2) contain multiple mesoscale eddies and are given by

$$ \left\{\begin{array}{c}\hfill {c}_t\left(x,y\right)=25-{y}^2/40+3 \cos \left[{L}_x\xi \left(x,y\right)\right]\mathit{\sin}\left[{L}_y\eta \left(x,y\right)+\beta \right]\hfill \\ {}\hfill \begin{array}{l}\xi =\frac{x}{10}-0.3 \cos \left(\frac{y}{8}\right) \sin \left(\frac{x}{10}\right),\kern0.75em \eta =\frac{y}{8}-0.2 \sin \left(\frac{x}{5}\right)\left[1- \cos \left(\frac{y}{8}\right)\right]\\ {}\left({L}_x,{L}_y,\beta \right)=\left(3,2,\pi /2\right)\end{array}\hfill \end{array}\right., $$
(33)

for the large-eddy field (Fig. 4a) and given by

$$ \left\{\begin{array}{c}\hfill {c}_t\left(x,y\right)=25-{y}^2/40+3 \cos \left[{L}_x\xi \left(x,y\right)\right]\mathit{\cos}\left[{L}_y\eta \left(x,y\right)+\beta \right]\hfill \\ {}\hfill \begin{array}{l}\xi =\frac{x}{10}-0.3 \cos \left(\frac{y}{8}\right) \sin \left(\frac{x}{10}\right),\kern0.75em \eta =\frac{y}{8}-0.2 \sin \left(\frac{x}{5}\right)\left[1- \cos \left(\frac{y}{8}\right)\right]\\ {}\left({L}_x,{L}_y,\beta \right)=\left(7,5,0\right)\end{array}\hfill \end{array}\right. $$
(34)
Fig. 4
figure 4

“Truth” field c t taken as a the analytical function (33) with large-scale eddy field L x  = 3, L y  = 2, β = π/2, and b the analytical function (34) with small-scale eddy field L x  = 7, L y  = 5, β = 0

for the small-eddy field (Fig. 4b). The background field is given by

$$ {c}_b\left(x,y\right)=25-{y}^2/40 $$
(35)

The “observational” points {r (m)} are randomly selected inside the domain (Fig. 5), with a total number M = 300, and are kept the same for all the sensitivity studies.

Fig. 5
figure 5

Randomly selected locations (total: 300) inside the domain as “observational” points

Sixteen sets of “observations” (c o ) are constructed from Fig. 4a, b using the analytical values plus white Gaussian noise (ε o ) of zero mean and various standard deviations (σ) from 0 (no noise) to 2.0, with a 0.1 increment from 0 to 1.0 and a 0.2 increment from 1.0 to 2.0 (16 sets in total), generated in MATLAB,

$$ {c}_o\left({\mathbf{r}}^{(m)}\right)={c}_t\left({\mathbf{r}}^{(m)}\right)+{\varepsilon}_o\left({\mathbf{r}}^{(m)}\right). $$
(36)

Figure 6a, b shows 6 out of the 16 constructed sets with σ = (0, 0.2, 0.5, 1.0, 1.6, 2.0). Both the OSD and OI methods are used to obtain the analysis field c a (r n ) from these “observations”. Bilinear interpolation (see Appendix B) is used for the observation operator H in this study.
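
A sketch of how the test data can be generated following (33)-(36) is given below. The rectangular sampling box and the random seed are illustrative simplifications; in the paper the 300 points lie inside the curved domain (32), so sampled points falling outside the boundary would have to be rejected, and the observation operator H is then built by bilinear interpolation (Appendix B).

```python
import numpy as np

def truth_field(x, y, Lx, Ly, beta, trig):
    """'Truth' fields, Eqs. (33)-(34); trig is np.sin for (33) and np.cos for (34)."""
    xi = x / 10 - 0.3 * np.cos(y / 8) * np.sin(x / 10)
    eta = y / 8 - 0.2 * np.sin(x / 5) * (1 - np.cos(y / 8))
    return 25 - y**2 / 40 + 3 * np.cos(Lx * xi) * trig(Ly * eta + beta)

def background_field(x, y):
    return 25 - y**2 / 40                                  # background field, Eq. (35)

rng = np.random.default_rng(0)                             # seed is illustrative
x_obs = rng.uniform(-15, 15, 300)                          # M = 300 random "observational" points
y_obs = rng.uniform(-15, 15, 300)                          # (rectangular box used for simplicity)
sigmas = np.r_[np.arange(0, 1.01, 0.1), np.arange(1.2, 2.01, 0.2)]        # the 16 noise levels
c_t_obs = truth_field(x_obs, y_obs, 3, 2, np.pi / 2, np.sin)              # large-eddy "truth", Eq. (33)
obs_sets = [c_t_obs + rng.normal(0.0, s, c_t_obs.size) for s in sigmas]   # "observations", Eq. (36)
d_example = obs_sets[5] - background_field(x_obs, y_obs)   # innovation d for one noise level, Eq. (2)
```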

Fig. 6
figure 6figure 6

a “Observational” data (c o ) from Fig. 4a with added white Gaussian noises of zero mean and various standard deviations: a 0 (i.e., no noise), b 0.2, c 0.5, d 1.0, e 1.6, and f 2.0. b “Observational” data (c o ) from Fig. 4b with added white Gaussian noises of zero mean and various standard deviations: a 0 (i.e., no noise), b 0.2, c 0.5, d 1.0, e 1.6, and f 2.0

5 Comparison between OSD and OI

a. OSD analysis fields

The steep-descending mode truncation K OPT depends on the user-input parameter e o [see (25)] and the observational noise σ. \( {E}_a^2 \) and γ K are computed from the “observational” data in Fig. 6a, b. The threshold of mode truncation (27) varies with the significance level α. In this study, (e o , σ) vary between 0 and 2; α has two levels (0.05, 0.10) with z 0.05  = 1.645 and z 0.10  = 1.282 in (27). For given values of e o (= 0.2) and σ (= 0.8), the optimal mode truncation depends on the significance level α, with K OPT = 58 for α = 0.05 (Fig. 7a) and K OPT = 67 for α = 0.10 (Fig. 7b). Most results shown in this section are for α = 0.05 since it is a commonly used significance level.

Fig. 7
figure 7

Dependence of \( {E}_a^2 \) and γ K on K for the “observational” data for the small-scale eddy field with σ = 0.8 and e o  = 0.2 at two significance levels, a α = 0.05 (z 0.05  = 1.645) and b α = 0.10 (z 0.10  = 1.282), used as the threshold of mode truncation [see Eq. (27)]. The optimal mode truncation is 58 for α = 0.05 and 67 for α = 0.10

For the large-eddy field, K OPT is not sensitive to the values of σ and e o : it is 7 in the upper-left portion and 6 in the lower-right portion of Table 1. For the small-eddy field, K OPT takes values of 58 or 67 in most cases, 178 for high noise levels (σ ≥ 1.8) combined with low e o values (e o  ≤ 1.0), and 82 for low noise levels (σ ≤ 0.1) combined with low e o values (e o  ≤ 0.3) (Table 2).

Table 1 Dependence of K OPT on (σ, e o ) for the large-eddy field shown in Fig. 6a with significance level α = 0.05
Table 2 Dependence of K OPT on (σ, e o ) for the small-eddy field shown in Fig. 6b with significance level α = 0.05

The analysis field using the OSD data assimilation (13) for a particular user-input parameter e o and noise level σ, \( {c}_a^{OSD}\left({\mathbf{r}}_n,\sigma, {e}_o\right) \), is presented in Fig. 8a (the large-eddy field) using the “observations” in Fig. 6a (with various σ), and in Fig. 8b (the small-eddy field) using the “observations” in Fig. 6b (with various σ). Comparison between Figs. 8a, b and 4a, b demonstrates the capability of the OSD method, with the analysis fields \( {c}_a^{OSD}\left({\mathbf{r}}_n,\sigma, {e}_o\right) \) fully reconstructed in all cases.

b. OI analysis fields

With the assumption that the c field is statistically stationary and homogeneous, the OI Eq. (9), with the R and B matrices represented by (8) and (65) [see Appendix E], is used to analyze the “observational” data with three user-defined parameters (r a , r b , e o ). Here, r a and r b are the decorrelation scale and the zero crossing (r b  > r a ), and e o is the standard deviation of the observational error. Let these parameters take discrete values, with totals P a for r a , P b for r b , and P e for e o . In this study, we set P a  = P b  = P e  = 5: e o takes five values (0.2, 0.5, 1.0, 1.5, 2.0); considering the horizontal domain from −15 to 15 in both (x, y) directions, r a takes five values (2, 3, 4, 5, 6) and (r b  − r a ) takes five values (0.5, 1.0, 1.5, 2.0, 2.5). There are 125 combinations of (r a , r b , e o ) for the test.
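
The parameter sweep can be organized as sketched below. The exact covariance form of Eq. (65) in Appendix E is not reproduced here; the Gaussian-damped parabolic shape in background_covariance (which crosses zero at r = r b ) is only a placeholder with the same two length parameters, and oi_update refers to the OI sketch given after Eq. (9).

```python
import numpy as np
from itertools import product

def background_covariance(pts, r_a, r_b):
    """Placeholder distance-based covariance with decorrelation scale r_a and zero crossing
    at r = r_b; the exact form used in the paper is Eq. (65) of Appendix E."""
    r2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    return (1.0 - r2 / r_b**2) * np.exp(-0.5 * r2 / r_a**2)

# the 5 x 5 x 5 = 125 parameter combinations swept in the OI experiments
r_a_vals = (2, 3, 4, 5, 6)
gap_vals = (0.5, 1.0, 1.5, 2.0, 2.5)      # r_b - r_a
e_o_vals = (0.2, 0.5, 1.0, 1.5, 2.0)
combos = [(r_a, r_a + gap, e_o) for r_a, gap, e_o in product(r_a_vals, gap_vals, e_o_vals)]
# for each (r_a, r_b, e_o): B = background_covariance(grid_pts, r_a, r_b),
# R = e_o**2 * np.eye(M), and c_a = oi_update(c_b, c_o, H, B, R) as in Eq. (9).
```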

Fig. 8
figure 8figure 8

a The analysis field c a obtained by the spectral data assimilation [see Eq. (13)] using the steep-descending mode truncation with a significance level of α = 0.05 from the “observations” shown in Fig. 6a with six noise (σ) levels (0, 0.2, 0.5, 1.0, 1.6, 2.0) and four values of e o : a 0.2, b 0.5, c 1.0, and d 2.0. b The analysis field c a obtained by the spectral data assimilation [see Eq. (13)] using the steep-descending mode truncation with a significance level of α = 0.05 from the “observations” shown in Fig. 6b with six noise (σ) levels (0, 0.2, 0.5, 1.0, 1.6, 2.0) and four values of e o : a 0.2, b 0.5, c 1.0, and d 2.0

The analysis fields from the OI data assimilation (9), \( {c}_a^{OI}\left({\mathbf{r}}_n,\sigma, {r}_a,{r}_b,{e}_o\right) \), with four different sets of user-input parameters (r a , r b , e o ): (2, 2.5, 1), (4, 5.5, 1), (6, 8.5, 1), and (6, 8.5, 2), are presented in Fig. 9a (the large-eddy field) using the “observations” in Fig. 6a, and in Fig. 9b (the small-eddy field) using the “observations” in Fig. 6b. Comparison between Figs. 9a, b and 4a, b demonstrates the strong dependence of the OI output on the selection of the parameters (r a , r b , e o ). For the large-scale eddies (Fig. 9a), the analysis fields c a are very different from the “truth” field c t for r a  = 2, r b  = 2.5, e o  = 1 for all “observations” (Fig. 6a); the difference between the reconstructed and “truth” fields decreases as r a and r b increase; and the two fields are quite similar when r a  = 6, r b  = 8.5 for both e o  = 1 and 2, although the similarity reduces with increasing e o . For the small-scale eddies (Fig. 9b), the analysis fields c a are totally different from the “truth” field c t for r a  = 6, r b  = 8.5, e o  = 1 and 2 for all “observations” (Fig. 6b); they become less different as r a and r b decrease; and they are quite similar to c t when r a  = 2, r b  = 2.5, e o  = 1.

c. Root mean square error

The analysis field from OSD, \( {c}_a^{OSD} \), depends on only one user-input parameter, the observational error variance \( {e}_o^2 \); its uncertainty is represented by the root mean square error R OSD,

$$ {R}^{OSD}\left(\sigma, {e}_o\right)=\sqrt{\frac{1}{N}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OSD}\left({\mathbf{r}}_n,\sigma, {e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}. $$
(37a)

Average over all the values of e o leads to the overall uncertainty

$$ {\overline{R}}^{OSD}\left(\sigma \right)=\sqrt{\frac{1}{N{P}_e}{\displaystyle \sum_{e_o}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OSD}\left({\mathbf{r}}_n,\sigma, {e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}}. $$
(37b)

The analysis field using OI (\( {c}_a^{OI} \)) depends on three user-defined parameters (r a , r b , e o ). Its uncertainty due to a particular parameter is represented by

$$ {R}^{OI}\left(\sigma, {r}_a\right)=\sqrt{\frac{1}{N{P}_b{P}_e}{\displaystyle \sum_{r_b}{\displaystyle \sum_{e_o}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OI}\left({\mathbf{r}}_n,\sigma, {r}_a,{r}_b,{e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}}}, $$
(38a)
$$ {R}^{OI}\left(\sigma, {r}_b\right)=\sqrt{\frac{1}{N{P}_a{P}_e}{\displaystyle \sum_{r_a}{\displaystyle \sum_{e_o}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OI}\left({\mathbf{r}}_n,\sigma, {r}_a,{r}_b,{e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}}}, $$
(38b)
$$ {R}^{OI}\left(\sigma, {e}_o\right)=\sqrt{\frac{1}{N{P}_a{P}_b}{\displaystyle \sum_{r_a}{\displaystyle \sum_{r_b}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OI}\left({\mathbf{r}}_n,\sigma, {r}_a,{r}_b,{e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}}}, $$
(38c)

which are compared to \( {\overline{R}}^{OSD}\left(\sigma \right) \) and R OSD(σ, e o ).
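
These diagnostics reduce to root mean square errors pooled over grid points and parameter values; a small sketch follows, with the storage layout of the analysis fields left as an illustrative choice.

```python
import numpy as np

def rmse(c_a, c_t):
    """Root mean square error of a single analysis field against the truth, Eq. (37a)."""
    return np.sqrt(np.mean((c_a - c_t) ** 2))

def pooled_rmse(fields, c_t):
    """RMSE pooled over several analysis fields (one per parameter choice), Eqs. (37b), (38a-c)."""
    fields = np.asarray(list(fields))       # shape (P, N): P parameter combinations, N grid points
    return np.sqrt(np.mean((fields - c_t) ** 2))
```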

Fig. 9
figure 9figure 9

a The analysis field c a obtained by the OI data assimilation [see Eq. (9)] for the “observations” shown in Fig. 6a at various noise levels, with various combinations of user-defined parameters (r a , r b , e o ): (2, 2.5, 1), (4, 5.5, 1), (6, 8.5, 1), and (6, 8.5, 2). b The analysis field c a obtained by the OI data assimilation [see Eq. (9)] for the “observations” shown in Fig. 6b at various noise levels, with various combinations of user-defined parameters (r a , r b , e o ): (2, 2.5, 1), (4, 5.5, 1), (6, 8.5, 1), and (6, 8.5, 2)

Figure 10 shows the comparison between R OI(σ, r a ) and \( {\overline{R}}^{OSD}\left(\sigma \right) \) for five different r a values (2, 3, 4, 5, 6) and the two types (large-scale and small-scale) of “observational” field. R OI(σ, r a ) monotonically increases with σ and is generally larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \). For the “observations” representing the large-scale eddy field (L x  = 3, L y  = 2, see Fig. 6a), \( {\overline{R}}^{OSD}\left(\sigma \right) \) increases slightly from 0.32 for σ = 0 to 0.34 for σ = 2.0. However, R OI(σ, r a  = 2) is always larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \) and increases from 0.37 for σ = 0 to 1.13 for σ = 2.0; R OI(σ, r a  ≥ 3) is smaller than \( {\overline{R}}^{OSD}\left(\sigma \right) \) for small σ, equals \( {\overline{R}}^{OSD}\left(\sigma \right) \) at a certain σ 0 , and is larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \) for σ > σ 0 . The value of σ 0 increases with r a , from 0.4 for r a  = 3 to 1.0 for r a  = 6. R OI(σ, r a  = 6) increases from 0.13 for σ = 0 to 0.62 for σ = 2.0. For the “observations” representing the small-scale eddy field (L x  = 7, L y  = 5, see Fig. 6b), \( {\overline{R}}^{OSD}\left(\sigma \right) \) increases slightly from 0.22 for σ = 0 to 0.27 for σ = 0.4, markedly from 0.27 for σ = 0.4 to 0.40 for σ = 0.5, and slowly from 0.40 for σ = 0.5 to 0.71 for σ = 2.0. However, R OI(σ, r a ) is much larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \) for any r a . For example, R OI(σ, r a  = 2) increases from 0.43 for σ = 0 to 1.14 for σ = 2.0; …, R OI(σ, r a  = 6) increases from 0.89 for σ = 0 to 1.06 for σ = 2.0.

Fig. 10
figure 10

Comparison between R OI(σ, r a ) and \( {\overline{R}}^{OSD}\left(\sigma \right) \) of the analysis fields from the same “observations” with different noise levels with varying parameter r a  = (2, 3, 4, 5, 6) from top to bottom with the left panels using “observations” shown in Fig. 6a and the right panels using “observations” in Fig. 6b. The solid curves represent the OSD with the significance level of α = 0.05; and the dotted curves refer to the OI

Figure 11 shows the comparison between R OI(σ, r b ) and \( {\overline{R}}^{OSD}\left(\sigma \right) \) for five different (r b  − r a ) values (0.5, 1.0, 1.5, 2.0, 2.5) and the two types (large-scale and small-scale) of “observational” field. R OI(σ, r b ) monotonically increases with σ and is generally larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \). For the “observations” representing the large-scale eddy field (L x  = 3, L y  = 2, see Fig. 6a), R OI(σ, r b  − r a ) monotonically increases with σ from around 0.2 for σ = 0 to around 0.78 for σ = 2.0 for all values of (r b  − r a ), with σ 0 increasing from 0.4 for (r b  − r a ) = 0.5 to 0.6 for (r b  − r a ) = 2.5. For the “observations” representing the small-scale eddy field (L x  = 7, L y  = 5, see Fig. 6b), R OI(σ, r b  − r a ) is much larger than \( {\overline{R}}^{OSD}\left(\sigma \right) \) for any (r b  − r a ) and σ. For example, R OI(σ, r b  − r a  = 0.5) increases from 0.53 for σ = 0 to 1.00 for σ = 2.0; …, R OI(σ, r b  − r a  = 2.5) increases from 0.58 for σ = 0 to 1.00 for σ = 2.0.

Fig. 11
figure 11

Comparison between R OI(σ, r b ) and \( {\overline{R}}^{OSD}\left(\sigma \right) \) of the analysis fields from the same “observations” with different noise levels with different (r b  − r a ) = (0.5, 1.0, 1.5, 2.0, 2.5) with the left panels using “observations” shown in Fig. 6a and the right panels using “observations” in Fig. 6b. The solid curves represent the OSD with the significance level of α = 0.05; and the dotted curves refer to the OI

Figure 12 shows the comparison between R OI(σ, e o ) and R OSD(σ, e o ) for five different e o values (0.2, 0.5, 1.0, 1.5, 2.0) and the two types (large-scale and small-scale) of “observational” field. First, R OI(σ, e o ) monotonically increases with σ and is evidently larger than R OSD(σ, e o ) for all σ and e o . Second, the dependence of R OSD(σ, e o ) on σ is insensitive to changes of e o . For the “observations” representing the large-scale eddy field (L x  = 3, L y  = 2, see Fig. 6a), R OI(σ, e o ) is close to R OSD(σ, e o ) for σ < 1.2 and much larger than R OSD(σ, e o ) for σ > 1.2 with e o  = 0.2 and 0.5, and vice versa with e o  = 1.0, 1.5, and 2.0. R OI(σ, e o  = 2.0) increases slightly from 0.98 at σ = 0 to 1.08 at σ = 2.0 and is almost twice R OSD(σ, e o ) for all σ. For the “observations” representing the small-scale eddy field (L x  = 7, L y  = 5, see Fig. 6b), R OI(σ, e o ) is also larger than R OSD(σ, e o ). For example, R OI(σ, e o  = 2.0) increases slightly from 1.37 at σ = 0 to 1.42 at σ = 2.0, which is two to three times R OSD(σ, e o  = 2.0) for σ < 1.0.

Fig. 12
figure 12

Comparison between R OI(σ, e o ) and R OSD(σ, e o ) of the analysis fields from the same “observations” with different noise levels with varying parameter e o  = (0.2, 0.5, 1.0, 1.5, 2.0) from top to bottom with the left panels using “observations” shown in Fig. 6a and the right panels using “observations” in Fig. 6b. The solid curves represent the OSD with the significance level of α = 0.05; and the dotted curves refer to the OI

The overall performance between OI and OSD with various noise levels (σ) can be estimated by the error ratio,

$$ \kappa \left(\sigma \right)=\frac{{\overline{R}}^{OSD}\left(\sigma \right)}{{\hat{R}}^{OI}\left(\sigma \right)},\kern0.75em {\widehat{R}}^{OI}\left(\sigma \right)\equiv \sqrt{\frac{1}{N{P}_a{P}_b{P}_e}{\displaystyle \sum_{r_a}{\displaystyle \sum_{r_b}{\displaystyle \sum_{e_o}{\displaystyle \sum_{n=1}^N{\left[{c}_a^{OI}\left({\mathbf{r}}_n,\sigma, {r}_a,{r}_b,{e}_o\right)-{c}_t\left({\mathbf{r}}_n\right)\right]}^2}}}}}. $$
(39)

Figure 13 shows the dependence of κ(σ) (mostly less than 1) on σ for the two types (large-scale and small-scale eddies) of “observational” field represented by Fig. 6a, b, with two different significance levels (α = 0.05, 0.10) for the threshold of mode truncation in the OSD method (27). At α = 0.05 (Fig. 13a), for the large-scale eddy field, κ(σ) takes 0.71 at σ = 0, fluctuates with σ, and decreases to 0.57 at σ = 2.0; for the small-scale eddy field, κ(σ) increases monotonically with σ from 0.43 at σ = 0 to 0.67 at σ = 2.0. At α = 0.10 (Fig. 13b), for the large-scale eddy field, κ(σ) takes 1.17 at σ = 0 and decreases monotonically with σ to 0.40 at σ = 2.0; for the small-scale eddy field, κ(σ) increases monotonically with σ from 0.36 at σ = 0 to 0.70 at σ = 2.0. This means that the OSD performs better in the test cases. Integration of κ(σ) over the whole interval of the noise level [0, 2.0] yields

$$ \hat{\kappa}=\frac{1}{2}{\displaystyle \underset{0}{\overset{2}{\int }}\kappa \left(\sigma \right)d\sigma }=\left\{\begin{array}{ccc}\hfill \alpha =0.05\hfill & \hfill \alpha =0.1\hfill & \hfill \hfill \\ {}\hfill 0.76\hfill & \hfill 0.72\hfill & \hfill \mathrm{large}\hbox{-} \mathrm{scale}\ \mathrm{eddy}\hfill \\ {}\hfill 0.51\hfill & \hfill 0.59\hfill & \hfill \mathrm{small}\hbox{-} \mathrm{scale}\ \mathrm{eddy}\hfill \end{array}\right. $$
(40)
Fig. 13
figure 13

Dependence of the error ratio κ [see Eq. (39)] on σ using “observations” in Fig. 6a (represented by dots) and in Fig. 6b (represented by asterisks) with two different significance levels: a α = 0.05, and b α = 0.10

which means that the overall error for the OSD is 76 % (51 %) of the OI error for the large-scale (small-scale) eddy field for α = 0.05. The overall performance of the OSD method is relatively insensitive to the selection of the significance level α.
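
The error ratio (39) and its average (40) over the noise interval [0, 2] can be evaluated directly; the trapezoidal rule is written out explicitly below to handle the non-uniform σ grid, and the input arrays of pooled RMSE values are assumed to be ordered by σ.

```python
import numpy as np

sigmas = np.r_[np.arange(0, 1.01, 0.1), np.arange(1.2, 2.01, 0.2)]   # the 16 noise levels

def overall_ratio(R_osd_bar, R_oi_hat, sigma=sigmas):
    """Error ratio kappa(sigma), Eq. (39), and its mean over [0, 2], Eq. (40)."""
    kappa = np.asarray(R_osd_bar) / np.asarray(R_oi_hat)
    # trapezoidal rule on the (non-uniform) sigma grid, divided by the interval length 2
    kappa_hat = np.sum(0.5 * (kappa[1:] + kappa[:-1]) * np.diff(sigma)) / 2.0
    return kappa, kappa_hat
```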

The computational costs of the OSD and OI methods are comparable in the test cases. In the OSD method, the steep-descending mode truncation requires (a) the computation of a large number [K B in Eq. (26)] of eigenvectors, which can be done once and for all, and (b) the construction and solution of the OSD Eq. (13). In the OI method, however, the construction and solution of the OI Eq. (9) must be repeated each time the background/observations change.

6 Synoptic monthly gridded temperature and salinity fields

The OSD method is used to produce the synoptic monthly gridded (SMG) temperature (T) and salinity (S) datasets (Chu and Fan 2016a; Chu et al. 2016) from the two world ocean observational (T, S) profile datasets: the NOAA National Centers for Environmental Information (NCEI) World Ocean Database (WOD) and the Global Temperature and Salinity Profile Program (GTSPP). The synoptic monthly gridded absolute geostrophic velocity dataset (Chu and Fan 2016b) is also established from the SMG-WOD (T, S) fields using the P-vector method (Chu 1995; Chu and Wang 2003). These datasets have been quality controlled by NCEI professionals and can be openly downloaded for public use at http://data.nodc.noaa.gov/geoportal/rest/find/document?searchText=synoptic+monthly+gridded&f=searchPage. The duration is January 1945 to December 2014 for the synoptic monthly gridded WOD (T, S) and absolute geostrophic velocity fields, and January 1990 to December 2009 for the synoptic monthly gridded GTSPP (T, S) fields.

7 Conclusions

Ocean spectral data assimilation has been developed on the basis of the classic theory of the generalized Fourier series expansion, such that any ocean field can be represented by a linear combination of products of basis functions (also called modes) and the corresponding spectral coefficients. The basis functions are the eigenvectors of the Laplace operator, determined only by the topography with the same lateral boundary condition as the assimilated variable anomaly. They are pre-calculated and independent of any observational data and background fields. The mode truncation K depends on the observational data and a user-input parameter \( {e}_o^2 \) (i.e., the observational error variance) and is determined via the steep-descending method.

The OSD completely changes the common ocean data assimilation procedures such as OI, Kalman filtering (KF), and variational methods, in which the background error covariance matrix B needs to be pre-determined because the weight matrix W is used. The OSD instead uses the spectral form to represent the observational innovation at the grid points [see (17)]. Minimization of the truncation error variance leads to the optimal selection of the spectral coefficients. Thus, the background error covariance matrix B does not appear in the OSD procedure since the weight matrix W is not used. This is in contrast to the existing OI method, where the B matrix is often assumed to be stationary and homogeneous with user-defined parameters.

The capability of the OSD method is demonstrated through its comparison to OI using analytical 2D fields of large and small mesoscale eddies inside a domain with four rigid and curved boundaries as the “truth”, with white Gaussian noise of zero mean and standard deviations (σ) varying from 0 (no noise) to 2.0 added to the “truth” at randomly selected locations to form the “observations”. A simple covariance function (Bretherton et al. 1976) was used for the OI procedure with three user-defined parameters (r a , r b , e o ), each taking five possible values. The OSD uses the same values of e o . The performance of OSD and OI is compared by (1) the patterns for each of the 125 combinations of parameters, (2) the root mean square errors for varying parameters, and (3) the overall root mean square errors. The results show that the error reduction using the OSD is evident: the overall OSD error is 76 % (51 %) of the OI error for the large-scale (small-scale) eddy field at the significance level α = 0.05, and 72 % (59 %) at α = 0.10. In the context of practical application, synoptic monthly gridded world ocean temperature, salinity, and absolute geostrophic velocity datasets have been produced with the OSD method and quality controlled by the NOAA National Centers for Environmental Information (NCEI).

Two issues need to be addressed regarding the covariance matrix. First, the comparison between the OSD and OI is at one particular instant in time, and the B matrix used in the OI is based only on distance. Second, in covariance matrix-based methods with the covariance matrix fixed once and for all, it is well known that the very first data assimilation cycle performs well, but subsequent cycles are less effective because the remaining error tends to be orthogonal to the directions of the covariance matrix. In the OSD method, the correction is based on spectral (i.e., basis) functions chosen once and for all. A more sophisticated, flow-dependent covariance matrix would allow OI to perform much better. Further verification and validation under real-time ocean conditions are needed to assess the quality of OSD over assimilation cycles and to compare the OSD and OI methods.

In the two test cases (large and small eddy fields), it is clear that the optimal mode truncations K OPT (around 6 for the large-eddy field and around 60 for the small-eddy field) are very close to the number of eigenvectors required to represent the truth field (Fig. 4). This shows the capability of the steep-descending mode truncation. However, the performance of the method when the truth field is a mixture of large and small scales in different parts of the domain needs to be further investigated.