1 Introduction

Since the idea of digital particle image velocimetry (DPIV) was introduced, research has focused on developing different kinds of cross-correlation (CC) based interrogation methods for velocity vector field computation. The CC is typically computed with a fast Fourier transform (FFT) between two image-pair sub-regions, called interrogation areas (IA). The location of the highest correlation peak relative to the origin describes the average displacement of the particles inside the IA with an accuracy of 1 pixel. This procedure is the basic form of the FFT-CC algorithm. To increase the velocity estimation accuracy, the correlation values around the highest peak are used in a sub-pixel estimation procedure (e.g., Lourenco and Krothapalli 1995; Forliti et al. 2000; Roesgen 2003). Sub-pixel estimation is a widely researched topic. In this paper, a simple three-point Gaussian fit is used for sub-pixel estimation with all the algorithms to ensure that they are comparable with each other.
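The basic FFT-CC step with a three-point Gaussian fit can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function names are hypothetical, the correlation is circular (no zero padding), and the correlation plane is offset to positive values before the logarithm is taken.

```python
import numpy as np

def fft_cross_correlation(ia1, ia2):
    """Circular cross-correlation of two equally sized IAs via FFT."""
    a = ia1 - ia1.mean()
    b = ia2 - ia2.mean()
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    # Move zero displacement to the centre of the correlation plane.
    return np.fft.fftshift(corr)

def gaussian_three_point(c_minus, c_zero, c_plus):
    """Sub-pixel peak offset from three neighbouring correlation values."""
    lm, l0, lp = np.log(c_minus), np.log(c_zero), np.log(c_plus)
    return (lm - lp) / (2.0 * (lm - 2.0 * l0 + lp))

def displacement(ia1, ia2):
    """Return the (row, column) displacement of ia2 relative to ia1."""
    corr = fft_cross_correlation(ia1, ia2)
    # Assumption: shift values to be strictly positive so the log is defined.
    corr = corr - corr.min() + 1e-9
    n0, n1 = corr.shape
    i, j = np.unravel_index(np.argmax(corr), corr.shape)
    # Neighbours wrap around because the correlation is circular.
    di = gaussian_three_point(corr[(i - 1) % n0, j], corr[i, j],
                              corr[(i + 1) % n0, j])
    dj = gaussian_three_point(corr[i, (j - 1) % n1], corr[i, j],
                              corr[i, (j + 1) % n1])
    return (i - n0 // 2) + di, (j - n1 // 2) + dj
```

The integer part of the result is the peak location relative to the origin, and the fractional part comes from the Gaussian fit applied separately in each direction.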

In some cases, the particles move long distances and cannot be found in the second-exposure IA. The problem of loss-of-pairs and the displacement of the second IA has been studied by, e.g., Adrian (1991) and Keane and Adrian (1992). The need to compensate for the displacement caused by in-plane movements led to the suggestion of pre-computing the data to determine the initial displacement for the next cycle. This is called an iterative technique or, alternatively, a multi-pass technique. Typically, the computation is started with a larger IA than in the final computation cycle. The use of iterative CC algorithms coupled with a refinement of the size of the IAs is a typical example of the technique, demonstrated by, e.g., Scarano and Riethmuller (1999). Different variations of this algorithm are popular among PIV users, and, in this paper, the algorithm’s basic form is called iterative FFT-CC with discrete window shift (DWS).

The other approach to the problem of in-plane movements is an algorithm called direct normalized cross-correlation (DNCC), which is computed directly between the IAs. In this approach, the IA of the second exposure can be set as large as needed in order to cover all the particle displacements, and, thus, the initial displacement iterations are unnecessary. In the literature, the DNCC algorithm is also called correlation image velocimetry by Fincham and Spedding (1997) and by Fincham and Delerce (2000), and direct correlation PIV by Nogueira et al. (2001). DNCC can also be implemented via FFT, which significantly decreases the computational load (Ronneberger et al. 1998). The advantage of DNCC is that it decreases the mean bias and rms errors inherent in the DWS method, and it is used by e.g., Piirto et al. (2003). It is also possible to develop a hybrid form of DNCC. In that case, the pre-computation is performed by a standard FFT-CC or DWS algorithm, and only the correlation peak vicinity, typically, an area of 3×3 pixels or 5×5 pixels, is calculated by DNCC, such as the methods used by Piirto et al. (2001a) and McKenna and McGillis (2002). If the integer value of the correlation peak with DWS and DNCC is the same, the accuracy of the DNCC hybrid version is exactly the same as with DNCC, which is the third algorithm used in the current research.
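A direct normalized cross-correlation between a first-exposure IA and a larger second-exposure search area can be sketched as below. This is an illustrative sketch only: the function name `dncc`, the square IA, and the symmetric search range `max_shift` are assumptions made for the example, and no FFT acceleration is used.

```python
import numpy as np

def dncc(ia1, search, max_shift):
    """Normalized correlation of ia1 with every window of the search area.

    ia1    : (n, n) first-exposure interrogation area
    search : (n + 2*max_shift, n + 2*max_shift) second-exposure area,
             centred on the same image position as ia1
    Returns a (2*max_shift + 1, 2*max_shift + 1) correlation plane whose
    centre element corresponds to zero displacement.
    """
    n = ia1.shape[0]
    size = 2 * max_shift + 1
    assert search.shape == (n + 2 * max_shift, n + 2 * max_shift)
    a = ia1 - ia1.mean()
    a_energy = (a * a).sum()
    corr = np.zeros((size, size))
    for dy in range(size):
        for dx in range(size):
            window = search[dy:dy + n, dx:dx + n]
            b = window - window.mean()
            denom = np.sqrt(a_energy * (b * b).sum())
            # Each value is a correlation coefficient in [-1, 1].
            corr[dy, dx] = (a * b).sum() / denom if denom > 0 else 0.0
    return corr
```

Because every candidate window has the full IA size, no particle pairs are lost at the window edges as long as the true displacement stays within `max_shift`.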

In the fourth algorithm of this paper, called the FFT-CC with continuous window shift (CWS) technique (Lecordier et al. 1999), the IA displacement between the first and the second exposure includes a sub-pixel offset in addition to the discrete offset. In the CWS algorithm, image interpolation is needed to compensate for the sub-pixel offset. The basic idea of the CWS algorithm is that the evaluation error decreases as the number of iteration loops increases. The image interpolation raises the question of the optimum interpolation technique, and bilinear and cardinal function interpolation are tested in the current study. It must be noted that the final velocity estimate with these iterative methods depends on three factors: the discrete shift, the image interpolation, and the Gaussian peak fitting of the last loop. However, the mean bias error inherent in bilinear interpolation cannot be fixed by increasing the number of loops. Thus, a compensation method developed for CWS bilinear interpolation, based on particle image size analysis, is introduced and tested in this paper.

Iterative CWS with image deformation (CWD) is the next logical step from the linear shift CWS algorithms, as introduced by Huang et al. (1993). In this method, the window is deformed after each iterative loop according to the intermediate velocity results. This leads to an increased signal-to-noise ratio of the correlation peak, and it is especially useful in regions where the velocity gradient introduces a significant velocity difference between the edges of each interrogation window. CWD algorithms with different image interpolation techniques are discussed by, e.g., Scarano and Riethmuller (2000) and Scarano (2002). The convergence of this method is not automatic, and, after several iterations, the algorithm may become unstable and the residuals start to increase (Nogueira et al. 2001). Converged results can be reached in a few iterations if the intermediate velocity fields are spatially filtered with a moving average filter, but the best resolution is then lost (Scarano 2004). In this paper, the effect of the spatial filter size is tested with turbulent boundary layer (TBL) data.

The main goal of this paper is to compare conventional and iterative PIV interrogation algorithms with simulated and real data sets. The tests performed within the paper are collected in Table 1. The emphasis of the study is on turbulence statistics. Mean velocity, rms, and Reynolds shear stress estimates are compared with TBL direct numerical simulation (DNS) results of Moser et al. (1999) and Kawamura (1994). The structure of the paper is as follows: the accuracy of the PIV interrogation algorithms is tested with simulated data in Sect. 2; CWS image interpolation methods are discussed in Sect. 3; the deformation algorithm is introduced in Sect. 4; the computational performance of the algorithms is analyzed in Sect. 5; the tests with real data sets are performed in Sect. 6; and, finally, the conclusions are given in Sect. 7.

Table 1 Algorithm tests of this paper

2 Error analysis

The definition and analysis of errors inherent in PIV is a challenging subject of study, and most of the PIV methodology papers include quite comprehensive error analyses. Here, we use the following definition for PIV evaluation errors. The first error is called the mean bias and it is defined for a statistically large enough number of velocity fields. For N velocity fields in a particular position (i, j), it is defined as:

$$ E_{{{\text{mb}}}} {\left( {i,j} \right)} = \frac{1} {N}{\sum\limits_{\xi = 1}^N {\Delta D_{\xi } {\left( {i,j} \right)}} } $$
(1)

The rms error is defined by:

$$ E_{\text{rms}} \left( i,j \right) = \sqrt{ \frac{1}{N} \sum\limits_{\xi = 1}^{N} \left[ \Delta D_{\xi} \left( i,j \right) \right]^{2} } $$
(2)

The displacement error ΔD = D_PIV − D_TRUE is the difference between the measured and the true displacement. If the discrete part of the window displacement can be determined to within 0.5 pixels, the accuracy of the velocities after the final iteration with the DWS algorithm is typically on the order of 0.04 pixels (Westerweel et al. 1997), and the theoretical analysis of the measurement error is carried out by Westerweel (2000). In Figs. 1 and 2, the mean bias and rms errors are plotted for all algorithms with a mean particle size (mps) of 1.5 pixels, standard deviation (std) of 1.0 pixels, and 15 particles per IA of 32×32 pixels (particles per pixel, ppp=0.0147). In the CWS algorithm (Lecordier et al. 1999), both the bilinear and cardinal function methods are used in the image sub-pixel interpolation.
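Equations 1 and 2 can be evaluated directly from a stack of N displacement fields. The sketch below assumes that ΔD is the measured minus the true displacement and that the fields are stored as arrays of shape (N, rows, columns); the function name is hypothetical.

```python
import numpy as np

def mean_bias_and_rms(d_piv, d_true):
    """Per-position mean bias (Eq. 1) and rms error (Eq. 2).

    d_piv, d_true : arrays of shape (N, rows, cols) holding the measured
                    and the true displacement of N velocity fields.
    """
    delta = np.asarray(d_piv, dtype=float) - np.asarray(d_true, dtype=float)
    e_mb = delta.mean(axis=0)                    # average over the N fields
    e_rms = np.sqrt((delta ** 2).mean(axis=0))   # root of the mean square
    return e_mb, e_rms
```

Note that the rms error defined this way contains the mean bias, so a method with zero random scatter still has an rms error equal to the magnitude of its bias.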

Fig. 1

Mean bias errors in pixels with numerically generated PIV images as a function of the fractional displacement between 0 pixels and 1 pixel (mean particle size (mps)=1.5 and standard deviation (std)=1.0)

Fig. 2

Rms errors in pixels with numerically generated PIV images as a function of the fractional displacement between 0 pixels and 1 pixel (mps=1.5 and std=1.0). For legend, see Fig. 1

Because the pre-computation phase is easy, the first iteration is performed with an IA of 64×64 pixels, and the second iteration is performed with an IA of 32×32 pixels, and is marked as DWS1plus1. The idea of DWS pre-computation is also used with the real data sets for all the CWS and CWD methods. The DWS algorithm is stable and fast, and, in many normal measurement cases, it offers excellent input data for the CWS and CWD methods. The mean bias and rms errors of the standard FFT-CC and DWS methods are in good agreement with the previous studies, such as Westerweel et al. (1997) and Scarano and Riethmuller (1999).

The cardinal function interpolation is implemented as in, e.g., Scarano (2002), but without Hamming or Blackman windowing. The bilinear interpolation method is implemented in an iterative form, similarly to Gui and Wereley (2002). Three iterations are enough to reach converged results, at which point the rms error is also at a minimum. In the following figures, the three iterative loops with the CWS algorithm are marked as CWS3. The mean bias and rms errors have almost the same values and shapes as the mean bias and total measurement error calculated by Gui and Wereley (2002) for synthetic data of Gaussian-shaped particle images and random noise.

As can be seen, the bias error of the CWS tests with bilinear interpolation is on the order of the bias error of the DWS algorithm, but with the opposite phase. Mean bias errors can be reduced to the same order as in DNCC by using CWS with cardinal function interpolation; only a diminutive evaluation bias caused by in-plane movements remains. Also, the rms error is of the same order of magnitude for both DNCC and CWS with cardinal interpolation. These error estimates for DNCC and CWS with cardinal function interpolation are in good agreement with the simulations by Fincham and Delerce (2000) and Scarano and Riethmuller (2000), respectively.

3 CWS image interpolation

Image interpolation is the most challenging part of the CWS algorithm. The size distribution of the particle images is strongly related to the error of some PIV interrogation methods, whereas other methods, like DNCC, are more robust to the particle image size. CWS with cardinal interpolation, especially with a stencil size of 11×11 pixels, gives low mean bias and rms error values, as shown in Figs. 1 and 2. CWS with bilinear interpolation is a very fast algorithm compared to the CWS cardinal algorithm, but it has a relatively strong mean bias error, especially with some particle image size distributions. It is suggested that this error is compensated for in each image interpolation phase, and, thus, a bias error compensation procedure is developed. It is based on particle image size analysis of some real PIV image samples. First, the mean particle image size is analyzed, e.g., either by a correlation method or by a so-called region growth method (Gonzalez and Woods 2002). Then, a data set of artificially generated PIV images with known displacements, having the same mean particle image size as the real data set, is used to analyze the mean bias error of the CWS bilinear algorithm. In the correlation method, the shape of the correlation peak is related to the particle image size, and its calibration is performed with synthetic data of known particle sizes without particle image size variation. The correlation method is more robust to a high density of particle images per interrogation area than the region growth method. On the other hand, the region growth method is more straightforward and works well, especially with small particle images, and it also gives information on the particle size distribution. The particle image size analysis results for synthetic data with a particle density of 15 particles per IA of 32×32 pixels (ppp=0.0147) are shown in Fig. 3.

Fig. 3

Particle image diameters tested by correlation analysis and region growth method for synthetic data of std 0 pixels and 1 pixel

Based on the analysis of different particle image size distributions and particle image densities, it is noticed that the shape of the CWS bilinear bias error is typically sinusoidal. For this reason, two or three sine functions can be added to the bilinear image sub-pixel interpolation scheme. This does not increase the computational load significantly because most of the computational time is used in the intermediate velocity estimation phase. In the following equations, two sine correction terms are added to the bilinear interpolation scheme introduced by, e.g., Gui and Wereley (2002). In Eqs. 3 and 4, (I, J), (x, y), and (x_c, y_c) are the discrete, non-negative sub-pixel, and corrected sub-pixel values of the window shift, respectively. In Eq. 5, G_2(i, j) is the original second-exposure gray-value in position (i, j), and g_2(i, j) is the corrected, new gray-value in position (i, j).

$$\begin{aligned} X = I + x \\ Y = J + y \\ \end{aligned} $$
(3)
$$ \begin{aligned} & x_{c} = x + a \cdot \sin {\left( {2\pi x} \right)} + b \cdot \sin {\left( {4\pi x} \right)} \\ & y_{c} = y + a \cdot \sin {\left( {2\pi y} \right)} + b \cdot \sin {\left( {4\pi y} \right)} \\ \end{aligned} $$
(4)
$$ \begin{aligned} g_{2} \left( i,j \right) = {} & \left( 1 - x_{c} \right) \cdot \left( 1 - y_{c} \right) \cdot G_{2} \left( i + I, j + J \right) \\ & + x_{c} \cdot \left( 1 - y_{c} \right) \cdot G_{2} \left( i + I + 1, j + J \right) \\ & + \left( 1 - x_{c} \right) \cdot y_{c} \cdot G_{2} \left( i + I, j + J + 1 \right) \\ & + x_{c} \cdot y_{c} \cdot G_{2} \left( i + I + 1, j + J + 1 \right) \\ \end{aligned} $$
(5)
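Equations 3–5 can be sketched in code as follows. This is an illustration under assumptions rather than the implementation used here: the function names are hypothetical, the first array index is taken as i, and the image is treated as periodic at its borders (via `np.roll`) purely to keep the example short.

```python
import numpy as np

def corrected_fraction(x, a=0.0, b=0.0):
    """Eq. 4: sub-pixel shift with two sine correction terms."""
    return x + a * np.sin(2.0 * np.pi * x) + b * np.sin(4.0 * np.pi * x)

def shift_second_exposure(G2, X, Y, a=0.0, b=0.0):
    """Eq. 5: bilinear shift of the second exposure by (X, Y) pixels."""
    # Eq. 3: split the shift into a discrete part and a non-negative fraction.
    I, J = int(np.floor(X)), int(np.floor(Y))
    xc = corrected_fraction(X - I, a, b)
    yc = corrected_fraction(Y - J, a, b)

    def sample(di, dj):
        # out[i, j] = G2[i + di, j + dj], periodic at the borders (assumption)
        return np.roll(G2, (-di, -dj), axis=(0, 1))

    return ((1 - xc) * (1 - yc) * sample(I, J)
            + xc * (1 - yc) * sample(I + 1, J)
            + (1 - xc) * yc * sample(I, J + 1)
            + xc * yc * sample(I + 1, J + 1))
```

With a=0 and b=0 the function reduces to plain bilinear interpolation; non-zero a and b only change the interpolation weights, so the extra cost per loop is two sine evaluations per direction.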

In Fig. 4, the mean bias when a=0 and b=0 is shown, and also the calculated compensation value with two sine functions when a=0.031 and b=0.005. In this simple case, the tuning parameters a and b are found by a trial and error method. The tuning is performed for the synthetic data set with an image mps of 1.5 pixels and std 1 pixel. If the CWS bilinear interpolation algorithm with this correction method is compared with the CWS cardinal interpolation algorithm, it gives practically the same accuracy. Later, this particle image analysis procedure is performed in Sect. 6.

Fig. 4

Mean bias error after three iterations of the CWS bilinear algorithm with a=0 and b=0, and the compensation value with a=0.031 and b=0.005 for synthetic data sets of mps 1.5 pixels and std 1 pixel

The mean bias and rms errors with the different particle sizes of the synthetic data are shown in Figs. 5 and 6, respectively. Also, an example of the CWS algorithm with compensation is presented. The particle density is 15 particles per IA of 32×32 pixels (ppp=0.0147), and the mean particle size varies between 1.0 pixels and 2.5 pixels, while the std is 1 pixel. The chosen standard deviation leads to more realistic particle size distributions, and, thus, the generated images are rather more comparable with real images of this paper. The pre-computation is performed again by the DWS1plus1 algorithm, and CWS has three iterative loops.

Fig. 5

Mean bias errors in pixels as a function of the fractional displacement between 0 pixels and 1 pixel for the CWS methods for numerically generated PIV images with different particle sizes

Fig. 6

Rms errors in pixels as a function of the fractional displacement between 0 pixels and 1 pixel for the CWS methods for numerically generated PIV images with different particle sizes. For legend, see Fig. 5

4 CWD algorithm

In addition to the plain linear displacement of CWS algorithms, the deformation algorithm, CWD, similar to that discussed by Scarano and Riethmuller (2000), is tested with the bilinear interpolation. The input velocity field for CWD is computed via DWS1plus3, having always the final IA size of 32×32 pixels. Thus, the refinement of the size of the IAs is not needed between the deformation loops, but, if it is necessary to decrease the IA size, it is always possible to continue the computations from this point on. In this paper, the intermediate velocities between the iteration loops are computed by FFT-CC and DNCC. The advantage of DNCC is that it allows any size of IAs, if the refinement is used, and also, the bias error is decreased. This paper is limited to the second-exposure image interpolation and deformation.

The CWD algorithm is also tested with a higher overlapping (75%) than the other methods. Only the first-order deformation is used, but all the velocities calculated inside and at the borders of a particular IA are utilized. As an example with 75% overlapping, four velocity estimates at the corners of an 8×8-pixel area define the velocity estimates U(I, J) and V(I, J) in each pixel location (I, J) according to Eqs. 6 and 7:

$$ U \left( I,J \right) = \frac{ \left( 8 - I \right) \cdot \left( 8 - J \right) \cdot U_{1} + I \cdot \left( 8 - J \right) \cdot U_{2} + \left( 8 - I \right) \cdot J \cdot U_{3} + I \cdot J \cdot U_{4} }{64} $$
(6)
$$ V \left( I,J \right) = \frac{ \left( 8 - I \right) \cdot \left( 8 - J \right) \cdot V_{1} + I \cdot \left( 8 - J \right) \cdot V_{2} + \left( 8 - I \right) \cdot J \cdot V_{3} + I \cdot J \cdot V_{4} }{64} $$
(7)
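The per-pixel velocity of Eqs. 6 and 7 is a bilinear blend of the four corner estimates. A minimal sketch, with a hypothetical function name and the corner spacing `n` exposed as a parameter (n=8 corresponds to the 75% overlap example):

```python
import numpy as np

def dense_velocity(c1, c2, c3, c4, n=8):
    """Bilinear blend of four corner velocities over an n x n pixel area.

    c1..c4 are the corner estimates (U_1..U_4 or V_1..V_4). The result has
    shape (n + 1, n + 1) and is indexed as [I, J] with I, J = 0..n.
    """
    I, J = np.meshgrid(np.arange(n + 1), np.arange(n + 1), indexing="ij")
    return ((n - I) * (n - J) * c1 + I * (n - J) * c2
            + (n - I) * J * c3 + I * J * c4) / float(n * n)
```

At the corners the blend returns the corner estimates themselves, and at the centre of the area it returns their plain average.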

For fast convergence and stable operation of the CWD method, it is suggested that a spatial filter with a size on the order of the IA is applied to the intermediate velocity fields, except for the last loop. Typically, a converged result is achieved in only three iteration loops (Scarano 2004). The disadvantage is that the best resolution and the small velocity scales are lost. In addition, the spatial filtering increases the residual, which is an average estimate of the absolute residual velocities after each iteration loop, as reported by Scarano (2004). Small velocity scales should not be filtered, and, thus, it is suggested in this paper that the size of the spatial filter is gradually decreased. The iterations are continued, e.g., one loop per filter size, until the smallest possible filter size is reached. In this study, a moving average Gaussian-type filter is applied to the intermediate velocity vector fields. Different Gaussian filters are tested, and they are marked by G3×3 or G5×5, depending on the filter size. In addition, for the proper operation of the filtering procedure, the velocities outside the flow region by the wall of the TBL are set to zero. The velocity estimation of the last iteration loop is always performed without filtering, as in Scarano (2004). The DNCC method can be used to improve the accuracy of the velocity estimation of the final loop, and it is used here in a hybrid form to calculate only the correlation peak vicinity.

5 Computational performance

The computational performance of different algorithms is compared for an IA of 32×32 pixels. The performance analysis is done bearing in mind real data measured with PIV, and, thus, for the iterative algorithms DWS and CWS, the first iteration is done with an IA of 64×64 pixels. This is necessary to avoid erroneous initial shifts. For the DNCC methods, the computation is performed directly with an IA of 32×32 pixels. With the DWS algorithm, the refinement of the IA size is not necessary with every iterative loop, but successive iterations with the same size of IA may improve the results and, sometimes, even correct the spurious vectors. In addition to this, the extra loops with small IAs decrease the computational efficiency only moderately. Thus, three loops with the same size of IA of 32×32 pixels are used in this example.

In Fig. 7, DNCC-FFT (214%), DNCC hybrid (179%), CWS bilinear (214%), and CWD (220%) algorithms are shown in the same computational performance class. The CWS bilinear algorithm with bias correction (293%) is computationally in a totally different class to the CWS cardinal interpolation algorithm (3,100%). These performance numbers include DWS1plus3 pre-computation.

Fig. 7

Computational performance of different algorithms. The algorithms are compared to the performance of DWS1plus3, which is given the relative processing time of 100%. For the DNCC hybrid and the CWS and CWD methods also, the extra time needed (%) is shown if the DWS1plus3 data is available

For the hybrid DNCC method, the data is pre-computed also by DWS1plus3. The performance of the hybrid DNCC method varies because, in some cases, the peak is not found inside the 3×3-pixel area. This performance test is run with the simulated data. Also, with the real data, sometimes, the 5×5-pixel area is calculated by DNCC. In addition to the CWD velocity estimation, different kinds of fast modifications for hybrid algorithms, for example, the pre-computation performed only with standard FFT-CC, may also be possible with the real-time PIV applications, like the study by Piirto et al. (2002).

It must be noted here that the CWD algorithm is very fast, and often, the converged result after the pre-computation can be acquired in just one or two iterative loops when the spatial filter on the order of the IA size is applied. If the filter size decreasing technique is added, another one or two loops are necessary, and the performance is on the same order of magnitude as with CWD3 G3×3 of Fig. 7. If the data is validated between the iterative loops, this will increase the computational load, and this is not taken into account in these performance tests.

6 Experimental tests

The algorithms’ ability to handle peak-locking and their accuracy in real PIV data is analyzed. The first turbulent flow data set is measured in a forward-facing step (see Eloranta et al. 2003). The second data set is measured in the near-wall region of a TBL at Re_τ=510. The average size of the particle images for both data sets is analyzed by the region growth algorithm and the correlation analysis. The region growth algorithm, depending on the direction, gives average diameters of 1.4–1.5 pixels and 2.0–2.4 pixels for accelerating flow and TBL image samples, respectively. The correlation analysis gives 1.0–1.1 pixels for an accelerating flow image sample and 1.7–1.8 pixels for a TBL image sample. Thus, with the accelerating flow data set, the particle images are about 0.5–1.0 pixels smaller in diameter than in the TBL data set. Sample images of both data sets are shown in Figs. 8 and 9. Mean bias error compensation with CWS is tested with the TBL data having a similar particle image size distribution as the synthetic data set that is used in the compensation example shown in Fig. 4. The mean bias error with the small particle image size of accelerating flow data and CWS bilinear interpolation is diminutive, and the best results with the lowest peak-locking effect are received without compensation. The other algorithms are the same for both the tests.

Fig. 8

Sample image of accelerating flow data set. Particle images are smaller than in the samples of the TBL data set. The average diameter with the region growth algorithm is 1.4 pixels in the x direction and 1.5 pixels in the y direction

Fig. 9

Sample image (first exposure) of the TBL data set. A reflection by the wall is shown in the upper part, but no reflection can be found in the second exposure. The average diameter with the region growth algorithm is 2.4 pixels in the x direction and 2.0 pixels in the y direction

The PIV system consists of a Nd:YAG double-cavity laser and an 8-bit CCD camera with a resolution of 1,008×1,008 pixels. The light sheet thickness is about 1 mm for both test cases. The camera magnification is M=0.1 for the accelerating flow test and M=0.2 for the TBL test. The seeding density is about 10 particles per IA, and the water flow is seeded by silver-coated, neutrally buoyant hollow glass spheres with an average size of 10 μm. The flow directions are streamwise (x), vertical (y), and spanwise (z), respectively. The corresponding velocities are denoted by the plain variables U, V, W, and their fluctuating parts by u, v, w (in graphics).

6.1 Accelerating flow

In Fig. 10, the streamlines of flow through a 2:1 contraction of a forward-facing step are shown. The measured image area is 100×100 mm. The flow speed increases from 1 m/s to 2 m/s, corresponding to particle displacements of 7–15 pixels. The location of the measured xz plane is marked with a line. The rms values are rather low and close to the accuracy limit of the PIV system. In Figs. 11 and 12, the mean and rms profiles in the x direction for streamwise velocities, respectively, are plotted. Rms fluctuations due to peak-locking are easily observed in a steadily accelerating low-turbulence channel flow. Note that the minimum rms values coincide with the integer values of the mean pixel displacements. The peak-locking causes a mean bias error of about 0.1, 0.05, and 0.05 pixels for DWS, DNCC, and CWS with cardinal interpolation, respectively. For CWS with bilinear interpolation, the mean bias error is very small, less than 0.02 pixels.

Fig. 10

Streamlines of accelerating flow through a 2:1 contraction of a forward-facing step (h=50 mm). The measurement plane is marked with a dashed line

Fig. 11

Streamwise profiles of mean velocity calculated with the different algorithms

Fig. 12

Streamwise profiles of rms calculated with the different algorithms. For legend, see Fig. 11

The CWS bilinear interpolation algorithm suppresses peak-locking better than the other algorithms. The rms profile by CWS cardinal interpolation 11×11 is similar to the profile obtained by the DNCC method. This result is in good agreement with the artificial data tests with the same algorithms. Compensation for bilinear interpolation is not necessary. Even though the particle image size analysis gives a size classification on the order of 1.5 pixels, there exist many low-intensity particle images, often containing only 1 pixel, which have not passed the threshold level of the region growth analysis. The number of those particle images may shift the effective size distribution towards particle images even smaller than 1.5 pixels. For particles smaller than this, the mean bias error starts to decrease, and, therefore, peak-locking is diminutive in this particular case. This is not systematically tested with the synthetic data of this paper, but the particle image distribution of this data set seems to be very good for the CWS algorithm with bilinear interpolation.

DWS is the weakest algorithm at avoiding peak-locking. Also, it cannot be automatically expected that the deformation methods will work better than the non-deformation methods. With CWD algorithms, the general level of rms is increased by between 0.02 pixels and 0.06 pixels, depending on the size of the chosen spatial filter, and peak-locking of the rms estimate is on the same level as with DNCC and CWS cardinal interpolation. Stationarity tests are not performed to ensure that the rms values have reached their final value. However, because of the homogeneity assumption in the z direction, the number of samples is 50×500 in each streamwise position.

6.2 Turbulent boundary layer

In the second data set, the experiments are carried out in a free-surface water tunnel facility. The data is measured in a near-wall region of a TBL at Re_τ=510. The rms profiles of this data set are compared with the DNS data of Moser et al. (1999) for a TBL at Re_τ=590 and with the DNS data of Kawamura (1994) for a TBL at Re_τ=640. The tunnel provides controlled and disturbance-free flow to the test section, with a very low free-stream turbulence. A TBL is established on the bottom wall of the tunnel. A rectangular bump with a height of h=3 mm, not to be confused with h from the previous section, is used to trip the boundary layer at the entrance of the test section. The profiles are measured at 500h downstream of the bump. The width of the tunnel is about 200h, the height is about 250h, and δ is the boundary layer thickness in the measurement position. The measured image area is 25×25 mm. The friction velocity u_τ is found by fitting the mean velocity to Spalding’s universal velocity profile (Spalding 1961). The profiles are shown in Fig. 13, together with the DNS mean velocity profiles. Table 2 gives the TBL properties. The ratio between the measurement resolution Δ (IA size) and the Kolmogorov length scale η is η/Δ=3.5. The rough estimate for dissipation used in the Kolmogorov length scale is ε=u_τ^3/(ky), in which k is the von Kármán constant (k=0.41) and y is the distance from the wall corresponding to y+=30. The resolution with an overlapping of 50% is about 3.5 wall units.
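As a worked illustration of the dissipation estimate, the Kolmogorov length scale can be expressed in wall units. The sketch assumes the standard definition η=(ν^3/ε)^{1/4}, with ν the kinematic viscosity (a symbol not used in the text above), and the estimate of ε evaluated at y+=30:

$$ \eta = \left( \frac{\nu^{3}}{\varepsilon} \right)^{1/4}, \qquad \varepsilon = \frac{u_{\tau}^{3}}{k y} \quad \Rightarrow \quad \eta^{+} = \frac{\eta\, u_{\tau}}{\nu} = \left( k\, y^{+} \right)^{1/4} = \left( 0.41 \cdot 30 \right)^{1/4} \approx 1.87 $$

Under these assumptions, the Kolmogorov scale at y+=30 is a little under two wall units.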

Fig. 13

Spalding’s universal velocity profile, DNS mean velocity profiles, and PIV mean velocity computed by the CWD algorithm and 75% overlapping

Table 2 Boundary layer properties

The size of the data set is 1,087 image pairs, and, because of the streamwise homogeneity assumption, the number of velocity samples is 60×1,087 at each wall-normal distance with an overlapping of 50%. The stationarity analysis is performed both by dividing the data set in the x direction into two parts, as in Piirto et al. (2001b), and by computing the data in different time sequences. In these tests, the difference between the rms values is marginal, i.e., less than 0.5%. The data sets are validated before the estimation of the time-average turbulence quantities. The validation criteria are the same for all the sets, and the results are shown in Table 3. Spurious vectors either have unrealistic values, marked as the reference limit, or they are detected by a local median criterion, similarly to Westerweel (1994). The DWS1plus3 and DNCC algorithms have the smallest number of spurious vectors. The CWD algorithm without filtering is the most unstable one, and the number of detected spurious vectors is almost 1%. Modest filtering with the Gaussian type 3×3 spatial filter reduces the number of spurious vectors of the CWD algorithm to the same level as the non-deformation algorithms. The spurious vectors are replaced by interpolation.
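The two validation criteria can be sketched as follows. This is only an illustration of a reference-limit check combined with a local median check on one velocity component; the function name, the 3×3 neighbourhood, and both threshold parameters are assumptions, and the actual criteria and values used for Table 3 are not reproduced here.

```python
import numpy as np

def flag_spurious(u, limit, threshold):
    """Boolean mask of spurious vectors for one velocity component.

    limit     : reference limit on the absolute value (hypothetical value)
    threshold : allowed deviation from the median of the neighbours
                (hypothetical value)
    """
    rows, cols = u.shape
    bad = np.abs(u) > limit                       # unrealistic values
    for r in range(rows):
        for c in range(cols):
            # Up to eight neighbours; fewer at the borders of the field.
            neighbours = [u[rr, cc]
                          for rr in range(max(r - 1, 0), min(r + 2, rows))
                          for cc in range(max(c - 1, 0), min(c + 2, cols))
                          if (rr, cc) != (r, c)]
            if abs(u[r, c] - np.median(neighbours)) > threshold:
                bad[r, c] = True                  # local median criterion
    return bad
```

The median of the neighbours is insensitive to a single outlier, so an isolated spurious vector is flagged without contaminating the test for the vectors around it.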

Table 3 Spurious vectors with the different algorithms

In Figs. 14–22, the rms and Reynolds shear stress results with the different algorithms are shown. Figures 14, 15, 16, 17, and 18 are the results of the algorithms without deformation. Figures 19, 20, 21, and 22 are the results of the deformation algorithms, two with 50% overlap and two with 75% overlap, and, for comparison, CWD with 50% overlap results without filtering. The rms profiles and Reynolds shear stress profiles are compared with DNS results, and the turbulence intensity results are given in values relative to the streamwise mean velocity. The turbulence intensity peak of streamwise velocity varies between 27.0% and 28.5% with all the algorithms, i.e., the difference is about 1.5%. The difference between the non-deformation algorithms, excluding DWS1plus3, is less than 0.5%, but the peak values seem to be too close to the wall compared to the DNS and CWD results. The turbulence intensity differences in the outer parts of the boundary layer (y+>50) are marginal, less than 0.2%. Deformation algorithms with the filtering techniques seem to be better for turbulence intensity estimation in the streamwise direction than the non-deformation algorithms, and the intensity peak location and the values near the wall fit well with the DNS results. It must be noted that only the boundary layer near-wall region is comparable, and there exists a clear difference between PIV and DNS turbulence statistics when y+>70. This may be due to the different initial and boundary conditions of the various cases.

Fig. 14 Streamwise velocity rms calculated with different algorithms

Fig. 15 Streamwise velocity rms peaks of Fig. 14. For legend, see Fig. 14

Fig. 16 Wall-normal rms velocity calculated with different algorithms. For legend, see Fig. 14

Fig. 17 Wall-normal rms velocity near-wall values of Fig. 16. For legend, see Fig. 14

Fig. 18 Reynolds shear stresses calculated with different algorithms. For legend, see Fig. 14

Fig. 19 Streamwise velocity rms calculated with deformation algorithms

Fig. 20 Streamwise velocity rms peaks of Fig. 19. For legend, see Fig. 19

Fig. 21 Wall-normal rms velocity calculated with deformation algorithms. For legend, see Fig. 19

Fig. 22 Reynolds shear stresses calculated with deformation algorithms. For legend, see Fig. 19

For the wall-normal velocity fluctuations, there is a gap between the DNS results, the results of CWD, and the results of the non-deformation algorithms. Especially with the CWD algorithms, the difference in the turbulence intensity level compared to the DNS results is 0.5–1.0%. Closer to the wall, the turbulence intensity varies between 4.5% and 9.0%, i.e., the difference is about 4.5% for the wall-normal velocities. These turbulence intensities are evaluated at the second-to-last data point by the wall (y+=7). If the CWD no-filtering and DWS1plus3 algorithms are excluded, the difference is about 1.5%. The low estimates by the wall with the DWS1plus3 algorithm seem to be caused by peak-locking, which is seen especially in the velocity distributions in Figs. 23 and 24. Generally, DWS1plus3 seems to give either lower or higher turbulence intensity values than the other non-deformation algorithms. The most difficult part of the flow for the deformation algorithms is the wall-normal velocity fluctuations in the reflection area, which was shown in the sample image in Fig. 9. There is a clear step increase in the wall-normal turbulence intensities (y+=15), especially with CWD and a 75% overlap.

Fig. 23 Streamwise velocity distributions of TBL data (counts)

Fig. 24 Wall-normal velocity distributions of TBL data (counts). For legend, see Fig. 23

Deformation algorithms seem to be better in the estimation of the Reynolds shear stress uv, and the estimates are closer to the results of DNS than the estimates of non-deformation algorithms. Reynolds shear stresses are plotted in Figs. 18 and 22.

The peak-locking effect of DNCC is the second highest after DWS, although its mean bias error with simulated data is the lowest of all the algorithms. This is seen especially in the velocity distributions of Fig. 23. CWS with cardinal interpolation and CWS with bilinear interpolation and bias correction give quite similar results, and their peak-locking effect is reasonably low. With the deformation algorithm, the peak-locking effect is the smallest.
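Peak-locking of this kind can be made visible by examining the sub-pixel part of the measured displacements. The following sketch assumes displacements given in pixels; the indicator `near_integer_fraction` and its half-width of 0.25 pixels are hypothetical choices for illustration, not the measure used for Figs. 23 and 24.

```python
import numpy as np

def fractional_part(disp_px):
    """Sub-pixel part of each displacement, in the interval [-0.5, 0.5]."""
    disp_px = np.asarray(disp_px, dtype=float)
    return disp_px - np.round(disp_px)

def fractional_histogram(disp_px, bins=20):
    """Counts of the sub-pixel displacement; a peak at 0 indicates peak-locking."""
    return np.histogram(fractional_part(disp_px), bins=bins, range=(-0.5, 0.5))

def near_integer_fraction(disp_px, half_width=0.25):
    """Share of displacements within `half_width` pixels of an integer value.

    For half_width=0.25, an unbiased estimator gives about 0.5;
    values approaching 1 suggest strong peak-locking.
    """
    return float(np.mean(np.abs(fractional_part(disp_px)) < half_width))

# Evenly spread displacements versus displacements clustered at integers.
uniform = (np.arange(400) + 0.5) / 100.0
locked = np.array([1.02, 2.0, 2.97, 3.05, 0.5])
ratio_uniform = near_integer_fraction(uniform)
ratio_locked = near_integer_fraction(locked)
counts, edges = fractional_histogram(uniform)
```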

The challenge with the deformation algorithms is to optimize the filtering between the iteration loops, since the different filtering techniques have significant effects on the results. The filter size decreasing procedure, explained in Sect. 4, is marked in Figs. 19–22 as: DWS1plus3 + CWD3 G5×5 + CWD1 G3×3. After the pre-computation, three loops are computed by CWD with a Gaussian filter size of 5×5, and one loop with a Gaussian 3×3 filter. After that, DNCC is used to compute the residual velocities. The described method gives higher rms results than the deformation method with a constant-size G5×5 filter, and, thus, smaller velocity scales are included in the result. The theory of the relation between the rms error and the spatial filter size can be found in a paper by Fouras and Soria (1998), and a modification of it, derived for the accuracy analysis of derivative estimation together with spatial filtering, in a study by Piirto et al. (2003). The residuals are shown in Table 4, and it can be seen that the residuals of the described method and of CWD G3×3 with a 50% overlap are almost the same, as are their results shown in Figs. 19–22.

Table 4 Residuals of CWD tests
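The decreasing filter schedule can be sketched as follows. The sketch assumes a normalized Gaussian kernel whose standard deviation is one quarter of the kernel size and edge-replicated borders; both are illustrative assumptions, since the filter parameters are not restated here. The callable `cwd_pass` is a hypothetical stand-in for one deformation and correlation loop, and the final DNCC step is not included.

```python
import numpy as np

def gaussian_kernel(size, sigma=None):
    """Normalized size x size Gaussian kernel (sigma = size / 4 is an assumption)."""
    if sigma is None:
        sigma = size / 4.0
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-0.5 * (ax / sigma) ** 2)
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def smooth(field, size):
    """Filter a 2-D vector-component field with the Gaussian kernel."""
    kernel = gaussian_kernel(size)
    pad = size // 2
    padded = np.pad(field, pad, mode="edge")
    ny, nx = field.shape
    out = np.zeros((ny, nx), dtype=float)
    for dj in range(size):
        for di in range(size):
            out += kernel[dj, di] * padded[dj:dj + ny, di:di + nx]
    return out

def run_schedule(predictor, cwd_pass, schedule=(5, 5, 5, 3)):
    """Apply the filter after each loop: three loops with 5x5, one with 3x3."""
    for size in schedule:
        predictor = smooth(cwd_pass(predictor), size)
    return predictor

# A short-wavelength test field: the 3x3 filter retains more of it than the 5x5.
x = np.arange(32)
field = np.tile(np.sin(2.0 * np.pi * x / 6.0), (32, 1))
rms_g3 = smooth(field, 3).std()
rms_g5 = smooth(field, 5).std()
constant = run_schedule(np.ones((8, 8)), lambda f: f)
```

In this test, the field filtered with the 3×3 kernel keeps a higher rms than the one filtered with the 5×5 kernel, which is in line with the observation that the decreasing schedule retains smaller velocity scales.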

7 Conclusions

Five cross-correlation-based particle image velocimetry (PIV) interrogation algorithms are compared with synthetic data and two real data sets containing small particles of size 1.0–2.5 pixels. The algorithms are: (1) fast Fourier transform cross-correlation (FFT-CC), (2) direct normalized cross-correlation (DNCC), (3) discrete window shift (DWS), (4) continuous window shift (CWS), and (5) FFT-CC CWS with image deformation (CWD). Different variations of the CWS and CWD algorithms are tested, and pre-computation for the CWS and CWD methods is performed via DWS. In the pre-computation, the first iterative loop of DWS is done with an interrogation area (IA) of 64×64 pixels, and the three following loops with an IA of 32×32 pixels. Gaussian three-point peak fitting for sub-pixel estimation is used in all the algorithms. The computation times of DNCC, CWS, and CWD with bilinear interpolation are of the same order of magnitude, whereas the computation time of CWS with cardinal interpolation (11×11 stencil) is about 15 times longer.
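For reference, the three-point Gaussian peak fit along one direction can be written in a few lines. The sketch uses the standard estimator based on the logarithms of the peak value and its two neighbours and assumes that all three correlation values are positive; the function name is illustrative.

```python
import math

def gaussian_three_point(r_minus, r_zero, r_plus):
    """Sub-pixel offset of the correlation peak from the integer peak location.

    r_zero is the highest correlation value; r_minus and r_plus are its
    neighbours along one direction. All three values must be positive.
    """
    lm, l0, lp = math.log(r_minus), math.log(r_zero), math.log(r_plus)
    return (lm - lp) / (2.0 * (lm - 2.0 * l0 + lp))

# Samples of a Gaussian-shaped peak centred at 0.3 pixels.
true_offset = 0.3
samples = [math.exp(-0.5 * (x - true_offset) ** 2) for x in (-1, 0, 1)]
estimate = gaussian_three_point(*samples)
```

For an exactly Gaussian correlation peak, the estimator recovers the true offset; with real particle images the peak shape deviates from a Gaussian, which is one source of the bias errors discussed above.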

The comparison starts with the numerically generated particle images with fractional displacements between 0 pixels and 1 pixel. According to the tests, CWS with bilinear interpolation and bias error compensation can reach the same accuracy as DNCC and CWS with cardinal interpolation methods. For the bias error compensation, the particle image size analysis and synthetic data with the same particle image size distribution are used for the calibration. With CWD algorithms, different Gaussian filter techniques are tested, and it is suggested that the spatial filter size be gradually decreased until the smallest possible filter size is achieved. The velocity of the final loop is computed by DNCC. The method decreases the residuals and respects small velocity scales better than a constant-size spatial filtering between the iterative loops.

Two real data sets measured in turbulent flows are examined. Rms fluctuations due to peak-locking are easily observed in a steadily accelerating low-turbulence channel flow. CWS bilinear algorithms suppress peak-locking slightly better than DNCC algorithms, but CWD algorithms do not perform better than the non-deformation algorithms; the rms level is increased. In the turbulent boundary layer (TBL) flow at Reτ=510, the best performance for the turbulence intensity of the streamwise velocity fluctuations and for the Reynolds shear stress is achieved by the CWD algorithms, together with spatial filtering applied between the iteration loops. As in the accelerating flow, the best estimate of the low-turbulence, i.e., wall-normal, velocity fluctuations is achieved with the non-deformation algorithms. With a turbulence intensity level of about 6% by the wall (y+=7), the maximum difference among all the tested algorithms is up to 4%. For the streamwise velocity turbulence intensity peak, the maximum difference is about 1.5%. In the outer parts of the boundary layer (y+>50), the turbulence intensities typically vary within 0.2% in the streamwise direction and within 0.5% in the wall-normal direction. Because the measurement resolution of the TBL case is under the Kolmogorov length scale and, in addition, there is a low-pass-filtering effect inherent in the PIV method, not all turbulence energy is recovered. It must be noted that the PIV measurement rms error compensates for these low-pass-filtering effects. If the rms error is on the order of 0.06 pixels, as are, e.g., the residuals of the deformation algorithms, the corresponding non-dimensional streamwise rms is u+=0.1. The rms and Reynolds shear stress profiles calculated by the CWD methods seem to be, generally, in best agreement with the TBL direct numerical simulation (DNS) data of Moser et al. (1999) and Kawamura (1994).