1 Introduction

Singular value decomposition (SVD) is a well-known mathematical tool that can be used to decompose an ensemble of velocity field data into spatiotemporal modes that may reveal coherent flow structures (Gentle 1998). Two special cases of the general SVD procedure are used in experimental fluid dynamics: proper orthogonal decomposition (POD) is used if the data are uncorrelated in time (as in a turbulent flow) (Holmes et al. 1996, 1997; Berkooz et al. 1993), and biorthogonal decomposition (BOD) is used if the data are correlated in time (as in a laminar flow) (Aubry 1991; Aubry et al. 1991).

Several recent experimental studies have employed POD or BOD to analyze particle image velocimetry (PIV) velocity field data. For example, POD was performed on PIV data of flow past a backward-facing step (Kostas et al. 2005), past a half-cylinder (Santa Cruz et al. 2005), in an internal combustion engine (Fogleman et al. 2004), and through an annular jet (Patte-Rouland et al. 2001). BOD was used to analyze PIV measurements of flow through a model of the human voice box (Neubauera and Zhang 2007), and BOD was also applied to experimental hot-wire anemometry velocity data in the study of the boundary layer on a rotating disc (Aubry et al. 1994). Recently, PIV-derived POD modes were used as a basis for direct numerical simulations of the flow past a circular cylinder by Ma et al. (2003). However, Ma et al. note that “the higher modes obtained from [POD of PIV velocity data] are noisy...”, and they employ a numerical method to work around these corrupted modes.

Aside from the work of Ma et al. (2003), this previous research has given little consideration to how experimental error affects the results of POD or BOD. In theory, large-scale flow structures are captured by the lower decomposition modes, whereas small-scale flow structures are captured in higher modes. In practice, experimental PIV error may dominate higher modes, rendering them corrupted by noise. Typical sources of experimental error in PIV include poor seeding density, high velocity gradients, and out-of-plane particle motion (Raffel et al. 2002).

Herein, we derive a threshold criterion that can be used to assess whether the magnitude of a decomposition mode is above the noise of the measurement. Modes that do not meet the criterion have a root mean square (RMS) velocity value that is less than the RMS measurement error. It is important to assess whether or not a mode meets our threshold criterion before it is interpreted physically (as in the above studies) or used as an input to a numerical simulation (as in Ma et al. 2003), because modes that do not meet the threshold may represent measurement error, not the physical flow phenomena of interest.

We apply our threshold criterion to the decomposition of PIV data of the wake behind a live swimming Giant Danio (Danio aequipinnatus). In our experiment, the flow evolves gradually between each frame of the high-speed PIV image sequence, so the measurements are correlated in time, and we employ the biorthogonal decomposition procedure. Generally, a two-dimensional slice of a carangiform swimming fish wake resembles a reverse Kármán street, which generates thrust to propel the animal through the water (Borazjani and Sotiropoulos 2008; Epps et al. 2009). The BOD of the fish wake reveals that the first four modes are similar to the modes of a regular Kármán street created in the wake of a stationary cylinder (Ma et al. 2000, 2003) and that higher modes are corrupted by experimental error.

In order to show that our error threshold criterion, which we develop for the general case of SVD, can be used to assess modes extracted by either BOD or POD, we present a brief synthesis of the relationship between SVD, BOD, and POD. We then proceed by deriving the error criterion and applying it to the experimental data from the swimming Danio.

1.1 Synthesis of SVD, POD, and BOD

Here, we briefly synthesize the mathematical relationship between SVD and its special cases, BOD and POD, in order to show that our error threshold criterion can be used to assess modes extracted by either BOD or POD, thus making it widely applicable to a range of experimental studies. The singular value decomposition (SVD) of a size [T, D] data matrix, X, is

$$ {\mathbf X} = {\mathbf u}\cdot {\mathbf s} \cdot {\mathbf v}^{\mathsf{T}} $$
(1)

where T and D are the number of time steps and data sites, respectively; matrices u, s, and v are size [T, T], [T, D], and [D, D], respectively; u contains the temporal eigenfunctions of X; v contains the spatial eigenfunctions of X; and s contains the singular values. That is, the SVD results in T modes (assuming D > T, which is typical of PIV data), each consisting of a time-varying amplitude, a singular value (which represents the magnitude of the mode), and a spatial mode shape. The first few modes capture the primary dynamics of the flow, and small perturbations are captured in the higher SVD modes (Holmes et al. 1996).

Several additional definitions are required. Consider mode k: its singular value is \(s_k \equiv {\bf s}(k,k)\) (note that all off-diagonal terms in s are zero). Its time-varying amplitude is given by the kth column of u, namely u(1:T,k). Its spatial mode shape is given by the kth column of v, namely v(1:D,k). The mode k data matrix is given by \({\bf X}_k = {\bf u}(1:T,k) \cdot {\bf s}(k,k) \cdot {\bf v}(1:D,k)^{\mathsf{T}},\) which is in accord with (1) and is size [T, D]. By definition, each column of u and v is normalized such that its ℓ2-norm is unity. That is, \(\sqrt{\sum_{i = 1}^{T} {\bf u}(i,k)^2} \equiv 1\) and \(\sqrt{\sum_{j = 1}^{D} {\bf v}(j,k)^2} \equiv 1.\) Thus, the magnitude of the mode is captured by its singular value. Also by definition, \({\bf u}^{{\mathsf{T}}} \cdot{\bf u} ={\bf I}\) and \({\bf v}^{\mathsf{T}} \cdot {\bf v} ={\bf I},\) where I is the identity matrix (Holmes et al. 1996).
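These definitions can be checked numerically. The following is a minimal NumPy sketch (the study itself used Matlab), in which a small random matrix stands in for PIV data; the variable names and the helper `mode_matrix` are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

# Hypothetical stand-in for a [T, D] data matrix (random numbers, not PIV data).
rng = np.random.default_rng(0)
T, D = 6, 20
X = rng.standard_normal((T, D))

# full_matrices=True returns u of size [T, T] and v^T of size [D, D].
u, s_diag, vT = np.linalg.svd(X, full_matrices=True)
v = vT.T

# Build the [T, D] matrix s with the singular values on its diagonal.
s = np.zeros((T, D))
s[:T, :T] = np.diag(s_diag)

# Eq. (1): X = u . s . v^T
reconstruction_ok = np.allclose(X, u @ s @ v.T)


def mode_matrix(k):
    """Return X_k = u(:,k) s(k,k) v(:,k)^T for a zero-based mode index k."""
    return s[k, k] * np.outer(u[:, k], v[:, k])


# Each column of u and v has unit l2-norm, so s(k,k) carries the mode magnitude.
u_norms = np.linalg.norm(u, axis=0)
v_norms = np.linalg.norm(v, axis=0)

# Summing all T mode matrices recovers the data matrix.
X_sum = sum(mode_matrix(k) for k in range(T))
print(reconstruction_ok, np.allclose(X_sum, X))
```

Note that NumPy indexes modes from zero, whereas the text numbers them from one.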

1.1.1 SVD of temporally correlated data (BOD)

Consider first, the case when the T realizations of data are correlated in time, as in the present work. The method is to perform an SVD and then to analyze both the spatial modes, v, and temporal modes, u, for coherent structures and temporal regularities. In fluid dynamics, this method is referred to as the biorthogonal decomposition (BOD) (Aubry 1991; Aubry et al. 1991), and in other disciplines, the method of empirical orthogonal functions. In summary, BOD and SVD are mathematically synonymous.

1.1.2 SVD of temporally uncorrelated data (POD)

Consider now, the case when the T realizations of data are uncorrelated in time, as in a turbulent flow experiment. The method then is to perform an SVD and analyze the spatial modes, as scaled by their singular values:

$$ {\mathbf Y} = {\mathbf s}\cdot {\mathbf v}^{\mathsf{T}} $$
(2)

Since there is no correlation in time, the temporal eigenfunction matrix, u, has no physical meaning and is discarded. In fluid dynamics, this method is referred to as the proper orthogonal decomposition (POD) (Holmes et al. 1996, 1997; Berkooz et al. 1993) or the method of snapshots or method of strobes (Sirovich 1987). In other disciplines, this procedure is also known as the Karhunen-Loève transform, principal components analysis, or method of empirical eigenfunctions.

The POD procedure is often presented as something distinct from SVD, but the following analysis shows that POD and SVD are actually equivalent, as implied by (2). The actual POD procedure is as follows: First, one forms the matrix of observed covariance between time steps, \({\bf R_T} ={\bf X} \cdot {\bf X}^{\mathsf{T}}\) (where \({\bf R_T}\) is size [T, T]). Next, one solves the eigenvalue problem \({\bf R_T} \cdot {\bf u} = {\bf u} \cdot {\bf s_T^2},\) which yields the size [T, T] eigenvector matrix u (which is identical to the SVD temporal amplitude matrix) and the size [T, T] eigenvalue matrix, \({\bf s_T^2}\) (which contains the squares of the T singular values of X; that is, \({\bf s_T^2} ={\bf s} \cdot {\bf s}^{\mathsf{T}}\)). Finally, one finds the POD modes, Y, by projecting the eigenvectors onto the data set by \({\bf Y} ={\bf u}^{\mathsf{T}} \cdot {\bf X}.\) Using (1), this POD procedure is equivalent to \({\bf Y} ={\bf u}^{\mathsf{T}} \cdot {\bf u} \cdot {\bf s} \cdot {\bf v}^{\mathsf{T}} ={\bf s} \cdot {\bf v}^{\mathsf{T}},\) as given in the SVD-based POD procedure (2). In summary, the POD procedure is equivalent to performing an SVD and discarding the temporal modes.
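The equivalence of the two routes can be illustrated with a short NumPy sketch. It assumes a small random matrix in place of real data, and the variable names are hypothetical. Because eigenvectors are defined only up to sign, the sketch aligns signs before comparing the two results.

```python
import numpy as np

# Hypothetical [T, D] data matrix (random numbers, not PIV data).
rng = np.random.default_rng(1)
T, D = 5, 30
X = rng.standard_normal((T, D))

# SVD route, Eq. (2): Y = s . v^T (reduced SVD keeps the first T rows of v^T).
u, s_diag, vT = np.linalg.svd(X, full_matrices=False)
Y_svd = np.diag(s_diag) @ vT

# Covariance route: eigen-decomposition of R_T = X . X^T.
R_T = X @ X.T
eigvals, eigvecs = np.linalg.eigh(R_T)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # reorder to descending, like the SVD
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
Y_pod = eigvecs.T @ X                       # project eigenvectors onto the data

# Align the arbitrary sign of each eigenvector with the SVD result.
signs = np.sign(np.sum(Y_pod * Y_svd, axis=1))
Y_pod_aligned = signs[:, None] * Y_pod

print(np.allclose(eigvals, s_diag**2), np.allclose(Y_pod_aligned, Y_svd))
```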

We have synthesized the BOD and POD procedures and shown that they both rely on the umbrella mathematical tool, the SVD. Thus, the threshold criterion developed in the following section to assess the validity of SVD modes can be applied to either POD or BOD analyses. In Sect. 4, we apply the threshold criterion to the BOD analysis of experimental PIV data of the wake behind a swimming fish.

2 Threshold criterion

We now present a threshold criterion for rejecting SVD modes obtained from experimental PIV data. The criterion that determines whether the magnitude of a mode is larger than the experimental error can be stated in three mathematically equivalent ways:

  (a) the root mean square (RMS) velocity of the mode is larger than the RMS PIV measurement error,

  (b) the signal to noise ratio is greater than unity,

  (c) the kinetic energy of the mode is greater than the kinetic energy of a hypothetical spatiotemporal velocity field, with normally distributed velocities that have zero-mean and a standard deviation equal to the RMS PIV measurement error.

To derive the criterion, first note that the velocity for mode k at time step i and measurement site j is \({\bf X}_k(i,j) \equiv {\bf u}(i,k)\,{\bf s}(k,k)\,{\bf v}(j,k).\) Thus, the root mean square (RMS) velocity of mode k is

$$ \begin{aligned}\hbox{RMS}_k &= \sqrt{{\frac{1}{DT}} \sum_{i = 1}^T \sum_{j = 1}^D\left( {\mathbf u}(i,k) {\mathbf s}(k,k) {\mathbf v}(j,k) \right)^2}\cr &= {\frac{{\mathbf s}(k,k)}{\sqrt{DT}}} \cdot \sqrt{ \sum_{i =1}^T {\mathbf u}(i,k)^2} \cdot \sqrt{ \sum_{j = 1}^D {\mathbf v}(j,k)^2} \cr &= {\frac{s_k}{\sqrt{DT}}} \end{aligned} $$

Suppose that the PIV velocity data have root mean square error of \(\epsilon\). Then (a) requires that the threshold criterion

$$ s_k > \sqrt{DT}\cdot\epsilon$$
(3)

be met for the magnitude of the mode to be larger than the experimental error. If (3) is not satisfied, then the magnitude of the mode is less than the measurement error, and the mode will be, as Ma et al. (2003) say, “noisy”, since it may be dominated by random PIV measurement error.

Our threshold criterion can also be viewed in terms of a signal to noise ratio. From this viewpoint, the singular value is a measure of the signal content of the mode; modes which do not satisfy (3) have low signal content and may be dominated by noise. The signal to noise ratio of mode k is, by definition, \(\hbox{SNR}_k \equiv \left[ {\frac{\hbox{RMS}_k}{\epsilon}} \right]^2 ={\frac{s_k^2} {DT \epsilon^2}}. \) Statement (b) requires that \(\hbox{SNR}_k > 1,\) which is mathematically equivalent to (3).

A third way to view our threshold criterion is in terms of kinetic energy. The total kinetic energy (per unit density of fluid) of mode k is the sum of the kinetic energy of each of the D velocity values over all T time steps, which works out to be \(\hbox{KE}_k = {\frac{1}{2}} s_k^2.\) The kinetic energy of a hypothetical error velocity field with normally distributed velocities with zero mean and standard deviation, \(\epsilon\), is \({\frac{1}{2}} \epsilon^2DT\) (Venturi 2006) (this kinetic energy is also recovered in the hypothetical scenario of all velocities equal to \(\epsilon\)). Criterion (c) requires \({\frac{1}{2}} s_k^2 >{\frac{1}{2}} \epsilon^2 DT,\) which implies (3).

We suggest that the threshold criterion (3) be used as a rough limit of the validity of SVD modes extracted from experimental PIV data. Modes that do not satisfy (3) should be considered artifacts of the noise in the measurement and disregarded. Since we showed in Sect. 1.1 that both the BOD and POD procedures are equivalent to SVD, this threshold criterion applies to both BOD and POD analyses.
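As a rough illustration of how criterion (3) and its signal to noise form might be applied, the following Python sketch classifies a list of singular values. The function names and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def svd_error_threshold(n_sites, n_steps, rms_error):
    """Threshold singular value sqrt(D*T)*epsilon from criterion (3)."""
    return math.sqrt(n_sites * n_steps) * rms_error


def classify_modes(singular_values, n_sites, n_steps, rms_error):
    """Return one (RMS velocity, SNR, passes criterion) tuple per mode."""
    threshold = svd_error_threshold(n_sites, n_steps, rms_error)
    results = []
    for s_k in singular_values:
        rms_k = s_k / math.sqrt(n_sites * n_steps)   # RMS velocity of mode k
        snr_k = (rms_k / rms_error) ** 2             # signal to noise ratio
        results.append((rms_k, snr_k, s_k > threshold))
    return results


# Hypothetical values: D = 400 sites, T = 25 time steps, epsilon = 0.1,
# so the threshold is sqrt(400 * 25) * 0.1 = 10.
example = classify_modes([50.0, 12.0, 3.0], n_sites=400, n_steps=25, rms_error=0.1)
for rms_k, snr_k, passes in example:
    print(f"RMS = {rms_k:.3f}, SNR = {snr_k:.2f}, above threshold: {passes}")
```

In this hypothetical example, the first two modes exceed the threshold and the third does not; the signal to noise test (SNR greater than unity) gives the same verdict for each mode, as expected from the equivalence of statements (a) and (b).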

3 Materials and methods

3.1 Experimental details

To test the threshold criterion (3), we analyzed experimental data from a study with swimming fish (Epps et al. 2009). In this experiment, a Giant Danio (Danio aequipinnatus) was allowed to swim freely in a 15 cm by 30 cm tank, with 10-cm-deep water. The flow features were characterized using high-speed PIV (Raffel et al. 2002). The tank was seeded with silver-coated, neutrally buoyant, hollow glass spheres (average diameter 93 μm). The particles were illuminated using a low-powered, near-IR diode laser. The Lasiris Magnum diode laser produced a maximum output of 500 mW at 810 nm and was fitted with optics to produce a 10° fan of light. The horizontal light sheet was imaged using an IDT XS-3 CCD camera with an 85 mm Nikkon lens, which viewed up from the bottom of the tank, as illustrated in Fig. 1a. The high-speed camera captured 8-bit-depth images at 100 frames/s, with a resolution of 1,280 × 864 pixels. The field of view was 19.94 × 13.46 cm, giving a magnification of 64.2 px/cm.

Fig. 1

a Experimental PIV setup used for swimming study. The high-speed camera viewed up through the bottom of a glass aquarium, and the laser sheet was oriented horizontally at the mid-plane of the fish. b Qualitative illustration of the vorticity field in the last time step. Clockwise vorticity is shown in blue, and counterclockwise vorticity is shown in red. The fish tail is shown in black. The fish swam from right to left at nearly-constant speed and made two nearly-identical tail flaps

The time series of particle images were processed using the multi-pass cross-correlation algorithm in the LaVision DaVis 7.2 software package. In the first pass, interrogation windows at the same location in each of two successive PIV images were cross-correlated to give an estimate of the particle displacements in that window. In the second pass, the window from the second image was centered at the displacement position estimated in the first pass, increasing the accuracy of the cross-correlation. In this experiment, we performed the first pass with 64 × 64 px interrogation windows and the second pass with 32 × 32 px windows, with 50% overlap in adjacent windows for higher resolution velocity fields. The output was velocity fields of 80 × 54 vectors, with approximately 40 vectors along the length of the fish body. For this time-series data, the time step between velocity fields was the same as the time between PIV image frames, ▵t. All data post-processing, including wake interrogation, data smoothing, and the BOD analysis, was performed in Matlab.

Each time the fish swam steadily through the field of view with the laser at its mid-plane, a time series of images was saved. Results presented herein are for an adult fish (which had an overall length of L = 9.89 cm) swimming steadily at a speed of U = 9.49 cm/s = 0.96 L/s, which corresponds to a Reynolds number of Re = UL/ν = 9300. Three successive tail beats were observed T = 38 frames apart, yielding a constant flapping frequency of f = 2.6 Hz. The average tail flap amplitude was H = 1.38 cm = 0.14 L, which corresponds to a Strouhal number of St = fH/U = 0.37.

In this particular case, the fish made two nearly-identical tail flaps, so these data were selected for the example BOD analysis presented herein. The tail flaps were biased to the lower end of the page, resulting in an asymmetric wake. The wake is qualitatively illustrated in Fig. 1b, where instantaneous vorticity contours show that its configuration is a reverse Kármán street.

3.2 Wake interrogation

Technically, the BOD procedure does not require the spatial location of each measurement to have a ‘fixed identity’ (i.e., a fixed location relative to the fish). However, in order to facilitate interpretation of the BOD modes and to be able to compare them to the modes of a regular Kármán street formed behind a stationary cylinder computed by Ma et al. (2003), we must interrogate the wake in a body-fixed frame of reference.

Outlines of the first and last wake interrogation grid are shown in Fig. 2a. The fish trajectory was determined by locating the position of the caudal fin fork at each extremum in tail lateral excursion and fitting straight lines via least squares through these points, with the average of these two fits giving the centerline of the trajectory. The origin (x = 0) locations of the grids for the first and last time steps were located by perpendicularly projecting the tail extrema positions onto the trajectory centerline, as shown. Intermediate wake grids were equispaced along the trajectory, which yielded a body-fixed interpolation, since the fish swam at steady speed. Interpolation was performed using the Matlab command \(\texttt{griddata(\ldots,`cubic')},\) which performs a triangle-based cubic interpolation (MathWorks 2009). The raw PIV velocity field was cropped (as shown in Fig. 2a) in order to expedite the interpolation procedure, which does not use data far away from the interpolation sites anyway. These interpolated wake data were projected into the wake coordinate system to find wake-aligned velocity components, \(V_x\) and \(V_y.\) The translation velocity of the wake grid was ignored in the interpolation procedure, since it is constant and would, therefore, be removed prior to BOD anyway. By interpolating on a moving grid, we ‘collected’ PIV data in a body-fixed reference frame.

Fig. 2

Illustration of data processing: a Wake interpolation schematic: tail maximum excursion positions, filled circle; swimming trajectory, dashed line; and outline of wake interpolation grid, solid line. The period of tail flapping is 38 frames (f = 2.6 Hz). b Data smoothing: interpolated PIV data, filled square, are phase-averaged to yield the “noisy” data, filled circle; smoothing spline values, dashed line, are phase-averaged to yield the “smoothed” data, plus. The data shown are \(V_y(x \approx S_L,\, y \approx 0)(t)\)

3.3 Experimental PIV error

The two primary sources of PIV measurement error are the loss of in-plane particle pairs and error due to large velocity gradients. Other factors such as particle image diameter and particle image displacement have lesser effects (Raffel et al. 2002; Melling 1997). Adequate seeding density (∼15 particles per interrogation window) and a high frame rate can mitigate the loss of in-plane particle pairs, but PIV velocity measurement error due to velocity gradients persists. In our experiment with the Danio, the “noisy” data had an RMS velocity gradient of 0.02 [(px/▵t)/px], where ▵t is the time step between successive images, which according to Raffel et al. (2002) corresponds to an RMS PIV velocity error of approximately

$$ \epsilon \approx 0.1\,[\hbox{px}/\Updelta\hbox{t}]. $$
(4)

In the present experiment, the number of data sites is D = 1,776, and the number of time steps is T = 38; therefore, the threshold singular value (3) is

$$ s_k > \sqrt{1776 \cdot 38}\cdot 0.1\,[\hbox{px}/\Updelta\hbox{t}] = 26.0\,[\hbox{px}/\Updelta\hbox{t}] $$
(5)

Note that by definition, singular values contain the units of the original data. Since we presently consider a decomposition of velocity fields, the units of the singular values are [px/▵t]. We chose not to normalize the singular values (say, by the swimming speed), since it is easier to compare singular values to PIV error using the raw units.

3.4 Data smoothing

In order to assess the effect of measurement error on the BOD, we must compare the interpolated PIV data to a set of data in which the error has been removed. In the high-speed PIV experiment, the flow appears to evolve smoothly to the naked eye, but, presumably due to measurement error, the PIV measurements deviate from otherwise smooth trajectories. To find the trajectory of each measurement, we fit a smoothing spline to these data using the Matlab function \(\texttt{spaps}\) (MathWorks 2009). For example, Fig. 2b shows the transverse velocity at approximately one stride length downstream of the tail, \(V_y(S_L, 0)(t),\) where noisy PIV data follow the trajectory fit by the smoothing spline.

In the “noisy” and “smoothed” data sets presented herein, each trajectory is phase-averaged, as shown in Fig. 2b. Phase averaging eliminates minor transient differences between the flapping cycles and allows us to perform BOD on one flapping cycle worth of data. (The authors have also performed BOD without phase averaging and found that the first two BOD modes are nearly identical to those of the phase-averaged data, confirming quantitatively that the flow is periodic.) Finally, the “error” data set presented herein was constructed by taking the difference between the “noisy” and “smoothed” data. For each data set (“noisy”, “smoothed”, and “error”), a time series of velocity fields capturing one flapping cycle was input to the BOD analysis.
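The phase-averaging step and the construction of the “error” data set can be sketched as follows. This is only an illustration under stated assumptions: the helper `phase_average` is hypothetical, a clean sinusoid stands in for the spline-smoothed trajectory (the smoothing itself was done with Matlab's `spaps` and is not reproduced here), and the added Gaussian noise is synthetic.

```python
import numpy as np


def phase_average(series, period):
    """Average a time series (time along axis 0) over whole cycles of `period` frames."""
    series = np.asarray(series, dtype=float)
    n_cycles = series.shape[0] // period
    if n_cycles < 1:
        raise ValueError("series is shorter than one period")
    trimmed = series[: n_cycles * period]            # drop any incomplete cycle
    cycles = trimmed.reshape((n_cycles, period) + series.shape[1:])
    return cycles.mean(axis=0)                       # one cycle's worth of data


period = 38                                          # frames per tail-beat cycle
t = np.arange(2 * period)                            # two flapping cycles
clean = np.sin(2 * np.pi * t / period)               # stand-in for the smoothed trajectory
rng = np.random.default_rng(2)
raw = clean + 0.1 * rng.standard_normal(t.size)      # stand-in for interpolated PIV data

noisy = phase_average(raw, period)                   # "noisy" data set
smoothed = phase_average(clean, period)              # "smoothed" data set
error = noisy - smoothed                             # "error" data set
print(noisy.shape, float(np.sqrt(np.mean(error**2))))
```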

4 Results and discussion

4.1 PIV wake data

The “noisy”, “smoothed”, and “error” time series of wake velocity data are shown in Fig. 3. Since the translation velocity of the wake grid was ignored in the data interpolation, these vectors represent deviation from the free-stream in this body-fixed reference frame. The “noisy” and “smoothed” time series both represent a reverse Kármán street, which is the well-known wake structure of a carangiform swimming fish (Nauen and Lauder 2002; Borazjani and Sotiropoulos 2008). The velocity and vorticity fields of the “error” data have much smaller magnitude than the “noisy” data, and they appear random, indicating that the “error” data is in fact random PIV measurement error.

Fig. 3

Filmstrips of velocity field data input to the BOD analysis. Points of interest: swimming centerline, y = 0; tail flap envelope, y = ±0.07L; tail flap plane, x = 0; and stride length, x = 0.37L

The fish swam with nearly constant speed, but had a bias in its kick toward the bottom of the image. Thus, the vortices on the bottom side of the wake have higher vorticity levels than those on the top, and all vortices drift in the negative y direction as they convect downstream (due to self-induction of the wake).

The time-average of these V(x, y, t) velocity data is

$$ {\mathbf V}^{\rm avg}(x,y) = {\frac{1}{T}} \sum_{i=1}^T {\mathbf V}(x,y,t_i) $$
(6)

The time-averaged wake is a jet which grows in both magnitude and breadth as it progresses downstream, as shown in Fig. 4a. The maximum streamwise velocity occurs at approximately one stride length downstream of the tail flapping plane.

Fig. 4

Time-averaged wake velocity and vorticity fields. The time-average of the “smoothed” and “noisy” data are nearly identical, so the time-averaged “noisy” data is not shown. The time-averaged fields for the “error” data are nearly zero. The vector scale and vorticity colormap are the same for both figures

The time-averaged “error” data are shown in Fig. 4b. All velocity values are nearly zero; the mean “error” velocity is 0.001 [px/▵t]. For comparison, the mean velocity value of the “smoothed” data is 0.19 [px/▵t], nearly two hundred times greater than the mean “error” velocity value. Figure 4 indicates that the error data is random noise and that the signal to noise ratio of our measurements is quite high.

4.2 Singular values

We now present the results of the biorthogonal decomposition (BOD) of the “noisy”, “smoothed”, and “error” data sets. In order to perform the BOD, the velocity component data (which each are of size [N = 37, M = 24, T = 38]) are formatted into matrix X, which is of size [T = 38, D = 2MN = 1776]. Each column of X contains the T measurements made at a particular data site, less their time-averaged value, which was computed using (6). The BOD was performed using the Matlab command \(\texttt{svd}\) (MathWorks 2009).
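The data formatting described above can be sketched in NumPy as follows. The column ordering of the data sites and the function name `build_data_matrix` are assumptions, since the text specifies only the matrix size and the mean subtraction; random numbers stand in for the velocity fields.

```python
import numpy as np


def build_data_matrix(Vx, Vy):
    """Stack two [N, M, T] velocity components into a mean-subtracted [T, 2MN] matrix."""
    N, M, T = Vx.shape
    # Move time to the first axis, then flatten the two spatial axes.
    X = np.concatenate(
        [np.moveaxis(Vx, -1, 0).reshape(T, N * M),
         np.moveaxis(Vy, -1, 0).reshape(T, N * M)],
        axis=1,
    )
    # Subtract the time-averaged value at each data site, as in Eq. (6).
    return X - X.mean(axis=0, keepdims=True)


N, M, T = 37, 24, 38                       # sizes quoted in the text
rng = np.random.default_rng(5)
Vx = rng.standard_normal((N, M, T))        # hypothetical stand-in data
Vy = rng.standard_normal((N, M, T))

X = build_data_matrix(Vx, Vy)
u, s_diag, vT = np.linalg.svd(X, full_matrices=False)
print(X.shape, s_diag.shape)
```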

Figure 5a shows the singular values of the BOD modes, as well as a dashed line showing our threshold criterion (5), which is \(s_k >\sqrt{DT} \cdot {\epsilon}=26.0\,[\hbox{px}/\Updelta\hbox{t}].\) For both the “noisy” and the “smoothed” data, singular values 1 and 2 are approximately three times the error threshold and also three times larger than the next singular values. This implies that modes 1 and 2 capture most of the flow dynamics (i.e., they contain most of the kinetic energy). Singular values 3 and 4 are nearly equal to our error threshold criterion. In both the “noisy” and “smoothed” cases, singular values 5–38 are lower than the threshold; these modes may be contaminated by measurement noise.

Fig. 5

a BOD singular values for the “noisy” data, filled circle, and “smoothed” data, plus. The dashed line represents the PIV error criterion (3); modes 5–38 should be ignored, since their singular values are less than this threshold. b Singular values of the “error” data, filled square, and singular values of a size [T, D] matrix of Gaussian random numbers with standard deviation μ = 0.14 [px/▵t], filled diamond. The dashed line represents the PIV error criterion (3), and the dash-dotted line represents the expected singular values of a size [T, D] matrix of Gaussian random numbers with standard deviation, \(\epsilon\) = 0.1 [px/▵t]

Note that we do not imply that modes 5–38 only contain noise. Rather they must contain some of the signal, but since their magnitude is so small (and so little signal is left for them to capture), they are most likely dominated by measurement noise. Likewise, mode 1 must contain some measurement noise, but this is likely a small fraction of the signal content in this mode.

The singular values of the “error” data, shown in Fig. 5b, are all less than our threshold criterion, which indicates that the “error” data does not contain appreciable signal content. In addition, these singular values span only one order of magnitude; that is, compared to the singular values of the “smoothed” data, the singular values of the “error” data are nearly constant.

To appreciate this fact, consider a size [T, D] random data matrix, \({\bf X}_{\rm rand},\) populated by normally distributed random numbers with zero mean and standard deviation, \(\epsilon\). Since there is no coherent signal in such a matrix, all T singular values will be equal \((\hbox{assuming}\, D \ggg T \gg 1).\) To find their value, \(s_{\rm rand},\) note that the total kinetic energy of the data is

$${\frac{1}{2}} \sum_{i = 1}^T \sum_{j = 1}^D {\bf X}_{\rm rand}(i,j)^2 = {\frac{1}{2}}DT \epsilon^2 $$

and also is given by

$$\sum_{k=1}^T {\frac{1}{2}} s_k^2 = {\frac{1}{2}} T s_{\rm rand}^2.$$

Therefore,

$$ s_{\rm rand} = \sqrt{D}\epsilon $$
(7)

which is clearly less than the threshold, \(\sqrt{DT}\epsilon.\) Many of the singular values of the “error” data are approximately \(\sqrt{D} \cdot \epsilon = 4.2\,[\hbox{px}/\Updelta\hbox{t}],\) which supports the claim that these data are random noise.

One may argue that \(s_k > \sqrt{D}\epsilon\) should be the error threshold criterion, instead of (3). However, since (7) is the limit of no signal, it seems that modes for which \(\sqrt{D}\epsilon < s_k < \sqrt{DT}\epsilon\) may still have significant noise content. Hence, (3) is our suggested threshold criterion.
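The no-signal limit (7) can be illustrated with a quick numerical experiment. The sketch below is a NumPy illustration with synthetic Gaussian noise, not a reproduction of the figure: it uses the values of D, T, and \(\epsilon\) quoted above and compares the singular values of a random matrix to \(\sqrt{D}\epsilon\) and to the threshold \(\sqrt{DT}\epsilon.\)

```python
import numpy as np

D, T, eps = 1776, 38, 0.1                      # values quoted in the text
rng = np.random.default_rng(3)
X_rand = eps * rng.standard_normal((T, D))     # zero-mean noise with std eps

# Singular values only (no need for u and v here).
s_rand = np.linalg.svd(X_rand, compute_uv=False)

expected = np.sqrt(D) * eps                    # Eq. (7)
threshold = np.sqrt(D * T) * eps               # criterion (3)
rms_singular_value = np.sqrt(np.mean(s_rand**2))

print(f"sqrt(D)*eps   = {expected:.2f}")
print(f"RMS of s_rand = {rms_singular_value:.2f}")
print(f"min, max      = {s_rand.min():.2f}, {s_rand.max():.2f}")
print(f"threshold     = {threshold:.2f}")
```

For finite D and T, the individual singular values scatter around \(\sqrt{D}\epsilon\) rather than being exactly equal, but all of them remain well below the threshold.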

The RMS of the “error” velocity data actually is μ = 0.14 [px/▵t], which is slightly higher than the estimated RMS PIV error of \(\epsilon\) ≈ 0.1 [px/▵t]. In hindsight, finding the RMS of the “error” velocity values may be a more accurate way of estimating PIV error than the arguments offered in Sect. 3.3. Nevertheless, the results presented herein hold whether \(\epsilon\) ≈ 0.1 or 0.14 is used. For reference, the (nearly-constant) singular values of a size [T, D] matrix of Gaussian random numbers with standard deviation, μ = 0.14 [px/▵t], are shown in Fig. 5b.

4.3 Mode shapes and amplitudes

Figures 6, 7, and 8 show BOD modes 1–10 for the “noisy”, “smoothed”, and “error” data, respectively. Each BOD mode consists of a normalized temporal amplitude, normalized velocity field mode shape, and a singular value. In Figs. 6, 7, and 8 vorticity fields are shown, which were computed from the modal velocity fields; also, the temporal amplitudes shown are scaled by their respective singular values, in order to show the magnitude of each mode explicitly.

Fig. 6

BOD modes 1–10 for the “noisy” data. Vorticity fields shown were computed from BOD velocity mode shapes; amplitudes shown were scaled by their respective singular values

Fig. 7

BOD modes 1–10 for the “smoothed” data. The vector scale and vorticity colormap are the same as Fig. 6

Fig. 8

BOD modes 1–10 for the “error” data show no coherent signal content. The vector scale and vorticity colormap are the same as Fig. 6

Consider first the BOD amplitudes of the “noisy” data, shown in Fig. 6. Amplitudes 1 and 2 are approximately sinusoidal, which is expected since the tail motion (and thus, the fluid forcing) was approximately sinusoidal. Because the flow is periodic, we expect the BOD modes to appear in pairs, similar to the sine and cosine modes of a Fourier decomposition. Indeed, BOD amplitudes 1 and 2 have frequencies of approximately the tail flapping frequency, f, and amplitudes 3 and 4 have frequencies of approximately 2f. However, BOD amplitudes 5 and higher (5+) are quite noisy and do not appear sinusoidal.

In sync with the temporal-frequency doubling of the amplitudes, vorticity fields 1–4 also display a spatial-frequency doubling. Vorticity fields 1 and 2 display one clockwise and one counterclockwise vorticity patch within the first stride length downstream of the tail; in other words, modes 1 and 2 have the same spatial frequency as the original data, which resembled a reverse Kármán street (see Fig. 3). Modes 3 and 4 contain two cycles of vorticity within the stride length, which corresponds to twice the spatial frequency of the original data. Modes 5+ again break the mould; they do not repeat the pattern of frequency doubling that we would expect in a Fourier decomposition.

Modes 5+ should be ignored since neither their temporal amplitudes nor their spatial mode shapes show the expected frequency doubling observed in prior modes; instead, these modes appear noisy and random. According to our error threshold criterion (3), modes 5+ have a magnitude lower than the PIV measurement error and should be ignored, since they may be contaminated by measurement noise. Amplitudes 5+ shown in Fig. 6 do not evolve smoothly in time, as this flow appears to the naked eye. Vorticity fields 8–10 show no coherent structures. Although vorticity fields 5–7 appear to contain a coherent pattern of vortical patches, these could be due to some actual dynamics of the flow or could very well be indicative of some systematic PIV error, which appears more often in regions of high shear along the centerline of the wake. In any case, modes 5+ should be considered contaminated by measurement noise and ignored in data reconstructions or further dynamic analyses.

One might be concerned about whether the PIV spatial resolution is fine enough to resolve the small vortical structures expected to appear in modes 5 and higher. If modes 5 and 6 did contain information about the flow, then it would be expected that their spatial frequencies would be three times that of modes 1 and 2 (i.e., six vortices per stride length). Thus, the diameter of the vortices expected to appear in modes 5 and 6 would be d = (0.37L)/6 = 0.06L. The PIV spatial resolution was 16 px = 0.025L, which should be fine enough to resolve these vortices. However, modes 5 and 6 do not show such a vortical pattern.

One final point of interest in Fig. 6 is that the general form of vorticity fields 1 and 2 is similar to that of the decomposition of a Kármán street formed in the wake behind a circular cylinder (Ma et al. 2000). This is expected, since the fish wake is a reverse Kármán street.

Consider now the amplitudes of the temporally smoothed data, shown in Fig. 7. In the processing of these data, the trajectory of each data site (e.g., \(V_y(S_L, 0)(t)\)) was smoothed in time by fitting a smoothing spline to the data; thus, the temporal fluctuation of each data site was removed, but no spatial smoothing was performed. As a result, the BOD amplitudes shown in Fig. 7 evolve quite smoothly in time, whereas the vorticity fields contain the noise of this data set.

All amplitudes approximate sinusoids: amplitudes 1 and 2 have frequency f; 3 and 4, 2f; 5 and 6, 3f; and so on, in agreement with the expected Fourier result. However, only vorticity fields 1–4 display the expected spatial frequencies. As with the “noisy” data, “smoothed” data modes 5–7 do not display the expected spatial frequency, and modes 8–10 show no coherent pattern.
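The pairing of modes by harmonic can be illustrated with an idealized signal. The sketch below is not the experimental data: it assumes a noise-free periodic travelling wave with three harmonics of hypothetical amplitudes, sampled over exactly one period in time and one wavelength in space, and uses NumPy's SVD to stand in for the BOD.

```python
import numpy as np

# Hypothetical sampling: T snapshots over one period, D data sites over one wavelength.
T, D = 32, 64
t = np.arange(T) / T
x = np.arange(D) / D
phase = 2 * np.pi * (x[None, :] - t[:, None])   # travelling-wave phase, shape (T, D)

# Hypothetical harmonic amplitudes (fundamental plus two weaker harmonics).
amps = {1: 1.0, 2: 0.3, 3: 0.1}
X = sum(a * np.cos(n * phase) for n, a in amps.items())

# Rows of X are snapshots, so the columns of u are the temporal amplitudes.
u, s, vt = np.linalg.svd(X, full_matrices=False)

def dominant_harmonic(signal):
    """Index of the largest non-zero-frequency bin of the real FFT."""
    spectrum = np.abs(np.fft.rfft(signal))
    return int(np.argmax(spectrum[1:]) + 1)

harmonics = [dominant_harmonic(u[:, k]) for k in range(6)]
print(harmonics)
```

For this idealized wave, the singular values arrive in equal pairs and the temporal amplitudes of modes 1 and 2, 3 and 4, and 5 and 6 oscillate at f, 2f, and 3f, respectively. Measured data need not follow this pattern once noise dominates a mode.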

Finally, the BOD modes of the “error” data are shown in Fig. 8. None of the temporal amplitude signals show a coherent pattern. Spatial modes 2 and 3 of the “error” data have alternating vortical patches along the centerline of the swimming trajectory, similar to spatial modes 5–7 of the “noisy” data. This suggests that the PIV error that corrupted “noisy” modes 5–7 is being captured by “error” modes 2 and 3. The other “error” mode shapes appear to be random noise, indicating that the “error” data are primarily random measurement error.

4.4 Wake reconstruction

Since the present PIV experiment is only capable of resolving the first four BOD modes, one may ask if these modes are sufficient to reconstruct the fish wake. In general, one constructs a rank r approximation of X (called a Galerkin approximation) by summing the first r BOD modes, \({\bf X}^{(r)} = \sum_{k=1}^{r} {\bf X}_k = u(1{:}T, 1{:}r) \cdot s(1{:}r, 1{:}r) \cdot v(1{:}D, 1{:}r)^{\mathsf{T}}\). The original data matrix X can be recovered by summing all T modes, \({\bf X} = \sum_{k=1}^{T} {\bf X}_k\), which is just a restatement of (1).
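The rank r sum can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in this study: the function name `rank_r_reconstruction` is hypothetical, and the data matrix is a random stand-in with T snapshots as rows and D data sites as columns.

```python
import numpy as np

def rank_r_reconstruction(X, r):
    """Return the rank-r approximation of X built from its first r SVD modes."""
    # Economy-size SVD: X = u @ diag(s) @ vt, with s sorted in descending order.
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    # Counterpart of u(1:T,1:r) * s(1:r,1:r) * v(1:D,1:r)^T in the notation above.
    return u[:, :r] @ np.diag(s[:r]) @ vt[:r, :]

# Hypothetical stand-in for the data matrix (random numbers, not PIV data).
rng = np.random.default_rng(0)
T, D = 8, 50
X = rng.standard_normal((T, D))

X2 = rank_r_reconstruction(X, 2)        # low-order reconstruction from modes 1 and 2
X_full = rank_r_reconstruction(X, T)    # summing all T modes
full_error = np.max(np.abs(X_full - X)) # should be at round-off level
```

Truncating after the SVD, rather than recomputing it for each r, keeps the modes identical across reconstructions of different order.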

Only the first two BOD modes are needed in order to reconstruct the vortex street behind the swimming fish. Since the amplitudes of modes 1 and 2 are shifted temporally and their vorticity fields are shifted spatially, modes 1 and 2 can represent the formation and convection of vortices into the wake behind the fish, as shown in Fig. 9a. This is expected, since other researchers have found that only modes 1 and 2 are needed to reconstruct the (regular) Kármán street behind a circular cylinder (Ma et al. 2000). These modes capture 90.7% of the kinetic energy of the “noisy” time series of data.

Fig. 9 BOD low-order reconstructions of the “noisy” data. The vector scale and vorticity colormap are the same as in Fig. 3

Figure 9b illustrates that modes 3 and 4 add further detail to the shape and strength of the vortices in the reconstructed wakes. Since modes 1 through 4 capture 96.1% of the kinetic energy of the original velocity fields (i.e., they contain most of the signal content), many of the snapshots shown in Fig. 9b look virtually identical to the original “noisy” data (shown in Fig. 3a). The inclusion of modes 5–10 in the reconstruction (not shown) yields even better agreement with the “noisy” data, but since the signal strength of modes 5 through 10 is significantly lower than our error threshold criterion, we anticipate that these modes serve only to reintroduce the PIV error into the reconstructed solution.
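Energy percentages of this kind can be computed from the singular values alone. The sketch below assumes that the kinetic energy of mode k is proportional to the square of its singular value; the function name and the synthetic test signal (a unit-amplitude travelling wave plus weak random noise) are hypothetical and unrelated to the measured data.

```python
import numpy as np

def cumulative_energy_fraction(X):
    """Fraction of the total energy captured by modes 1..r, for every r."""
    s = np.linalg.svd(X, compute_uv=False)
    energy = s**2                     # assumption: mode energy scales with s_k^2
    return np.cumsum(energy) / np.sum(energy)

# Hypothetical test signal: a rank-2 travelling wave plus weak Gaussian noise.
rng = np.random.default_rng(0)
T, D = 32, 64
t = np.arange(T) / T
x = np.arange(D) / D
wave = np.cos(2 * np.pi * (x[None, :] - t[:, None]))
X = wave + 0.05 * rng.standard_normal((T, D))

fraction = cumulative_energy_fraction(X)
print(f"modes 1-2 capture {100 * fraction[1]:.1f}% of the energy")
```

Because the fraction is cumulative, `fraction[r - 1]` gives the share captured by a rank r reconstruction.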

5 Conclusions

In this work, we developed a threshold criterion (3) for rejecting singular value decomposition (SVD) modes. This threshold criterion can be interpreted in three ways. A mode is retained if:

  1. the root mean square (RMS) velocity of the mode is larger than the RMS PIV measurement error,

  2. the signal to noise ratio is greater than unity,

  3. the kinetic energy of the mode is greater than the kinetic energy of a hypothetical spatiotemporal velocity field with normally distributed velocities that have zero mean and a standard deviation equal to the RMS PIV measurement error.

Further, we showed that since both biorthogonal decomposition (BOD) and proper orthogonal decomposition (POD) are rooted in the SVD, this threshold criterion applies to both types of analyses.
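The first interpretation lends itself to a short numerical sketch. Criterion (3) is not reproduced in this section, so the form used below is an assumption inferred from that interpretation: the RMS magnitude of mode k over all T·D entries is taken as s_k/√(TD) and compared with an RMS error ε. The function name, the test signal, and the value of `eps` are all hypothetical.

```python
import numpy as np

def modes_above_threshold(X, eps):
    """Boolean mask: True where a mode's RMS magnitude exceeds the RMS error eps."""
    T, D = X.shape
    s = np.linalg.svd(X, compute_uv=False)
    # X_k = s_k * u_k * v_k^T has Frobenius norm s_k, so its RMS over T*D entries
    # is s_k / sqrt(T*D) (assumed form of the criterion, see lead-in).
    rms_mode = s / np.sqrt(T * D)
    return rms_mode > eps

# Hypothetical test signal: unit-amplitude travelling wave plus Gaussian "error".
rng = np.random.default_rng(0)
T, D = 20, 400
eps = 0.05
t = np.arange(T) / T
x = np.arange(D) / D
wave = np.cos(2 * np.pi * (x[None, :] - t[:, None]))
X = wave + eps * rng.standard_normal((T, D))

mask = modes_above_threshold(X, eps)
print("modes retained:", np.flatnonzero(mask) + 1)
```

In this synthetic case only the two modes that carry the wave exceed the threshold, while the remaining modes, which hold only the added noise, fall below it.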

Herein, we performed a BOD analysis of 2D PIV data of a fish wake. We obtained ‘body-position-invariant’ velocity fields by interpolating the PIV data on a grid that translated with the swimming fish. We have shown that the first two BOD modes can represent the fish wake, which is a reverse Kármán street. Modes 3 and 4 add detail to the wake, whereas modes 5 and higher add little additional information and contain much of the measurement noise. As expected, only the first four modes obtained from our experimental data had magnitudes that met the threshold criterion. We found that smoothing our PIV data using smoothing splines has little effect on these first four BOD modes. However, since the smoothing removes PIV error, the magnitudes of the higher BOD modes were reduced.

Some parallels can be drawn between the present analysis and the results presented by Ma et al. (2003) regarding the POD modes of a cylinder wake, which is a regular Kármán street. They compare POD modes extracted from (noisy) PIV “experimental” data to those extracted from (precise) direct numerical simulation “DNS” data. (Unfortunately, they do not report the number of velocity vectors in their experiment nor do they report the units of their singular values, so we cannot verify if our threshold criterion (3) is valid in their case.) Similar to our results, Ma finds that the “experimental” data eigenmodes 1–4 were acceptable, but modes 5 and higher were corrupted by measurement error. In both our “noisy” data and Ma’s “experimental” data, the singular values of modes 5 and higher (5+) are clustered within one order of magnitude, whereas for our “smoothed” data and Ma’s “DNS” data, these singular values span six orders of magnitude. Since a matrix of random data has approximately constant singular values, we assert that the relatively constant singular values of our “noisy” and Ma’s “experimental” modes 5+, in contrast to the rapidly decaying singular values of our “smoothed” and Ma’s “DNS” modes 5+, indicate that these modes are capturing the experimental error.
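The statement about random data can be checked numerically. The following sketch assumes Gaussian, zero-mean, unit-variance entries and hypothetical matrix dimensions with far more data sites than snapshots; it is an illustration of the general behavior, not a model of the PIV error in either experiment.

```python
import numpy as np

# Hypothetical sizes: few snapshots (rows), many data sites (columns).
rng = np.random.default_rng(1)
T, D = 20, 5000
noise = rng.standard_normal((T, D))   # stand-in for pure measurement error

s = np.linalg.svd(noise, compute_uv=False)
spread = s.max() / s.min()            # ratio of largest to smallest singular value
print(f"largest / smallest singular value = {spread:.2f}")
```

For a matrix of this shape the ratio stays far below one order of magnitude, in contrast to the several orders of magnitude spanned by the singular values of smooth, noise-free data.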

These points taken together indicate that Kármán streets can be represented adequately with two to four BOD modes and that experimental error must be very precisely controlled if higher modes are desired.