1 Introduction

Cavitation occurs in hydraulic machinery and marine equipment such as pumps, turbines and propellers. The formation of vapor cavities in the flows accompanying the operation of such devices is undesirable: it reduces their efficiency and leads to mechanical wear or even destruction under the action of shock waves generated by collapsing vapor microbubbles. Studying the conditions favorable for cavitation inception, and the evolution and breakup of vapor structures, is an important part of research in both applied and fundamental science. The behavior of the vapor phase in a cavitating flow (its location and form, the dimensions of cavitation structures, vapor fraction, characteristic frequencies, etc.) is of great interest. Evaluating all of these parameters requires modern measurement and processing techniques.

Over the past decades, significant technical progress in studying two-phase dispersed flows, driven mainly by advances in instrumentation and in optical and laser-based diagnostics, has made it possible to accumulate a large volume of experimental data in the literature. However, the measurement techniques, among which Particle Image Velocimetry (PIV) is certainly the principal one, do not by themselves provide comprehensive information on all features of the dispersed phase distribution in a turbulent flow or on the effect of the local content of the dispersed phase on the flow characteristics. Extracting more quantitative data therefore requires additional measurement approaches and sophisticated methods of data processing and analysis.

The literature is now rich in experimental studies of cavitating flows and contains a large amount of PIV data for various test bodies and flow conditions. These PIV data are commonly presented in the form of instantaneous velocity and vorticity fields (Tassin et al. 1995; Foeth et al. 2006; Wei et al. 2016) and distributions of time-averaged velocity and various turbulence characteristics (Gopalan and Katz 2000; Huang et al. 2013; Pennings et al. 2015). However, such information is insufficient for a thorough analysis of the two-phase structure of a cavitating flow. From the standpoint of the mechanics of two-phase flows, the main interest lies in the behavior of the dispersed phase. In the case of cavitation, this means the presence and expansion of a sheet cavity, the process of cloud cavity shedding, the vapor concentration in cavitation structures and so on. Despite its significance, a statistical analysis of basic experimental data for such flows is not yet available in the literature.

For completeness of this brief literature overview, it should be noted that the X-ray attenuation method makes it possible to take direct quantitative measurements of the volumetric (not planar) vapor fraction in a two-phase medium (Bhatt et al. 2021; Ge et al. 2021; Maurice et al. 2021). In X-ray attenuation, the vapor fraction is evaluated directly as the instantaneous vapor concentration averaged over the entire depth of a two-phase flow, whereas PIV measurements are performed in a thin section coinciding with a laser light sheet. It is therefore difficult to reconcile datasets obtained by these two approaches even if they are applied together. Moreover, X-ray radiation is hazardous to human health, which imposes strict safety requirements on its use. Because the emitting and registering equipment for this type of radiation, together with the associated control and data processing software, is highly specialized and quite expensive, applying this method can be challenging. Thus, there is a need for a simpler and more accessible tool for estimating the dispersed phase fraction in a two-phase flow.

This paper aims to show that standard PIV measurements in the liquid phase are sufficient to obtain at least qualitative information about the location of a cavitation region and to distinguish several of its zones that differ in time-averaged vapor content. To this end, we present and verify a new method of statistical analysis that yields time-averaged distributions of the vapor (dispersed) phase in a cavitating flow. The distributions are evaluated statistically from the probability of tracer absence in a given flow region in the original PIV data for the liquid (continuous) phase. The method is especially useful given the widespread application of the PIV technique in both research laboratories and industrial facilities. The paper is structured as follows. Section 2 provides the most important details of the PIV approach employed for the velocity measurements. Section 3 describes the idea and main principles of the suggested method of statistical processing. Section 4 discusses the verification and implementation of the method and estimates the associated statistical error. Concluding remarks are drawn in Sect. 5.

2 PIV approach

Here, we consider a cavitating flow around a 2D symmetric hydrofoil section mimicking a guide vane of a Francis turbine (Timoshevskiy et al. 2016), installed in the test channel of the cavitation tunnel at the Kutateladze Institute of Thermophysics SB RAS (Kravtsova et al. 2014). Its chord length is \(C = 100\;{\text{mm}}\) and its aspect ratio is 0.8 (Fig. 1A). Further data analysis is performed only for the unsteady cloud cavitation regime (Regime IV in Timoshevskiy et al. (2020), where more details are available): an attack angle of 9°, Reynolds number \({\text{Re}}_{C} = U_{0} C/\nu = 1.32 \times 10^{6}\), cavitation number of 1.86, maximum sheet cavity length \(L_{C} /C = 0.53\), and reduced frequency of the cavity pulsations and cloud shedding \({\text{St}} = fL_{C} /U_{0} = 0.35\), where \(U_{0} = 10.47\) m/s is the mean velocity of the incoming flow and \(f = 69\) Hz is the natural frequency of the cavity auto-oscillations.
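As a quick consistency check of the regime parameters quoted above, the short sketch below recomputes the reduced frequency from the stated values and infers the kinematic viscosity implied by the Reynolds number. The variable names are illustrative, and the viscosity is a derived value that is not stated in the text.

```python
# Regime parameters quoted in the text (SI units).
chord = 0.100            # hydrofoil chord length C, m
u0 = 10.47               # mean velocity of the incoming flow U0, m/s
shedding_freq = 69.0     # natural frequency of cavity auto-oscillations f, Hz
cavity_ratio = 0.53      # maximum sheet cavity length L_C / C
reynolds = 1.32e6        # Reynolds number based on the chord

# Maximum cavity length and the reduced frequency St = f * L_C / U0.
cavity_length = cavity_ratio * chord
strouhal = shedding_freq * cavity_length / u0

# Kinematic viscosity implied by Re_C = U0 * C / nu (derived, not given in the text).
implied_viscosity = u0 * chord / reynolds

print(f"St = {strouhal:.3f}")
print(f"implied nu = {implied_viscosity:.3e} m^2/s")
```

The recomputed reduced frequency rounds to the value 0.35 given for this regime.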

Fig. 1

Placement of the hydrofoil in the measurement domain: A its photograph when installed in the test channel and B its mask for PIV measurements relative to the reference grid. In image B, the hydrofoil longitudinal section in the PIV measurement plane is highlighted by pink contour with the diagonal hatching, the green contour with the inverse diagonal hatching corresponds to the frontal end surface of the test model adjacent to the test channel sidewall. Solid and dashed gray curves show approximate contours of the sheet cavity of the maximum and minimum lengths, respectively, in the unsteady cloud cavitation regime. The part of the hydrofoil profile depicted by the pink dashed curve is obscured by the attached cavity. Solid (observable) and dashed (obscured) pink lines under the hydrofoil show the borders of the hydrofoil shadow in the measurement plane

For velocity measurements in the cavitating flow, we employed the same PIV system as the one used previously in Timoshevskiy et al. (2020). Its sampling rate (the recording frequency of image pairs) was 4 Hz. Physical dimensions of the measurement region were equal to 120 × 50 mm (Fig. 1B). Fluorescent PMMA seeding particles (Microparticles GmbH, fraction 1–20 µm, reemission wavelength range 550–700 nm) suspended in the flow were illuminated by a laser light sheet with a 0.8 mm thickness that spatially coincided with the central vertical longitudinal plane of the test channel. A CCD-camera lens was equipped with an optical low-pass filter to implement the LIF approach and, thereby, to avoid the contaminating effect of vapor microbubbles on PIV images when the laser light is reflected to the camera from the bubble boundaries.

Raw PIV images were processed in the same way as described in detail in Timoshevskiy et al. (2020). First, we applied two pre-processing procedures to enhance the quality of the initial PIV data: subtraction of the mean intensity field and masking. Instantaneous velocity vector fields were then calculated from the pre-processed images using an iterative cross-correlation algorithm with continuous shift and deformation of interrogation windows and 75% overlap. The initial size of the interrogation window was 64 × 64 pixels to guarantee a relatively large dynamic range of the measured flow velocity; it was then gradually reduced to a final size of 8 × 8 pixels to achieve a higher spatial resolution. The routine counted the local tracer concentration in every interrogation window, so that velocity vectors were estimated only in those image areas where the number of seeding particles was higher than a certain threshold (5 tracers per 32 × 32-pixel computational cell in this study). Tracers were identified by convolving the PIV images with a Gaussian mask of 1-pixel radius over a 5 × 5 pixel window, with a correlation coefficient threshold of 0.7. Sub-pixel interpolation of the cross-correlation peak was performed over three points using a one-dimensional Gaussian approximation. The sample volume (i.e., the size of the entire statistical ensemble of instantaneous measurements) was \(N = 5000\) image pairs.

It is worth noting that cavitation structures (Fig. 2A) are commonly free of the tracers as they do not populate the vapor phase (Fig. 2B), so the flow velocity is not measured inside the cavities in practice. In steady flow regimes, the flow velocity can be nevertheless measured very close to the wall where a cavitation sheet is absent or very thin (i.e., its transversal dimension is less than the interrogation window size in the iterative cross-correlation algorithm). However, under unsteady cloud cavitation conditions, an attached cavity exhibits global instability—auto-oscillations of its length accompanied by quasi-periodic cloud cavity shedding—and, consequently, the velocity measurements in the image areas occupied by the cavity turn out to be conditionally averaged. This means that, at the moments when the cavitation sheet is long enough, the measurements are actually impossible in the cavity region (inside the cavity) for the above-stated reason (Figs. 1B and 2) but, when the cavity is relatively short (and thin as a result), the flow velocity can be evaluated successfully in those areas (Fig. 1B). Hence, in the cloud cavitation regime, we measure the flow velocity in the cavity region only when the cavity length is reduced or cavitation totally disappears and do not measure it in the opposite case.

Fig. 2

Typical view of the unsteady cavitating flow around the hydrofoil: A snapshot from high-speed visualization and B PIV image partially masked to hide the hydrofoil and its shadow underneath (white area). In image B, the hydrofoil color mask is the same as in Fig. 1B. The bright dots are the seeding particles in the measurement plane and the dark blurred regions are cavitation (vapor) structures that extend over the entire hydrofoil span

3 Statistical method of vapor–phase detection

A PIV database is a set of instantaneous velocity fields in a selected measurement domain for a given flow regime. Each element of a velocity field (i.e., an individual instantaneous velocity vector) is assigned a status that determines how it is treated further. The status “valid” denotes a correct velocity measurement: the vector is regarded as properly calculated on a sufficient number of seeding particles (five or more per 32 × 32-pixel area in this paper) and has passed all validation procedures, if any were applied. Vectors evaluated on an insufficient number of tracers in an initial PIV double-frame image (i.e., fewer than five per 32 × 32-pixel area) are marked as “out-of-flow” and are not considered reliable. The status “outlier” is attributed to a velocity vector whose direction and/or magnitude is found to be incorrect during validation for any reason other than the number of particles (which must still be sufficient, as for a “valid” vector). Such vectors are regarded as invalid and are typically replaced by interpolation so that no empty areas remain in the velocity field. In dark flow regions that are not illuminated by the laser light sheet, for example the test model and its shadow, velocity vectors are “masked” regardless of the number of particles and are thereby permanently excluded from further processing.

The idea of the new method of statistical analysis is to process series of false events and determine the type of their distribution law, from which properties of a two-phase flow can be interpreted. If the only condition for a velocity measurement (event) to be regarded as false is an insufficient concentration of tracers (i.e., the velocity vector is assigned the “out-of-flow” status), it can be assumed that the computational cell under consideration is occupied by the dispersed (vapor) phase at the given time instant. The weak point of this assumption is that the “out-of-flow” status may result not only from the absence of the liquid phase at a given point, but also from a local lack of tracers in the liquid, as well as from a number of other factors, including technical ones. However, in the same computational cell, a similar series can also be formed for true events, namely velocity vectors with the “valid” and “outlier” status. These vectors are indeed true because, by definition, tracers are registered within the corresponding computational cells, which clearly indicates the presence of the liquid phase. Thus, these two series reflect the effect of vapor cavities on the properties of a cavitating flow, and their joint analysis can presumably enhance the reliability of the measurement results.

In this article, we sample event series according to the following algorithm. First, the vector status at a given point of an instantaneous velocity field is considered together with the statuses in the four adjacent mesh cells that form a cross with it. Next, to enhance the reliability of the event occurrence, all realizations (separate measurements) are filtered so that only events with the same vector status (“valid” plus “outlier” for liquid or “out-of-flow” for vapor) at both the current location and any of the four neighboring points are retained. Every event series in a given cell of the measurement domain is a time sequence with a step \(\delta t\) equal to the time delay between image pairs in the PIV experiment. Figure 3 shows an example of an event series extracted from the real experimental data, where unity denotes an event and zero denotes its absence. Such series are then analyzed for consecutive repetitions of these events (event chains) in the discrete sequence (Fig. 4). The sequences of successive repetitions are used to construct histograms of the probability of event chains of various lengths (Fig. 5). If this repeatability is governed by occurrences of unrelated (statistically independent) events at the point under consideration, the probability of a chain equals the product of the probabilities of the single events.
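The sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the status labels are stored as plain strings in nested lists (one 2D field per realization), the function names are hypothetical, and realizations that fail the neighbor check are simply recorded as zeros rather than handled as in the authors' actual processing code.

```python
from itertools import groupby

# Hypothetical string labels for the four vector statuses named in the text.
VALID, OUTLIER, OUT_OF_FLOW, MASKED = "valid", "outlier", "out-of-flow", "masked"
LIQUID = {VALID, OUTLIER}   # tracers registered, so liquid is present
VAPOR = {OUT_OF_FLOW}       # too few tracers, so vapor is assumed

def is_event(field, row, col, phase):
    """True if the cell and at least one of its four cross neighbours share the phase."""
    if field[row][col] not in phase:
        return False
    n_rows, n_cols = len(field), len(field[0])
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols and field[r][c] in phase:
            return True
    return False

def event_series(fields, row, col, phase):
    """Time sequence of 1 (event) and 0 (no event) at one sampling point."""
    return [1 if is_event(f, row, col, phase) else 0 for f in fields]

def chain_lengths(series):
    """Lengths of runs of consecutive events, in chronological order."""
    return [sum(1 for _ in run) for value, run in groupby(series) if value == 1]

def chain_histogram(lengths):
    """Number of event chains found for each chain length i."""
    histogram = {}
    for length in lengths:
        histogram[length] = histogram.get(length, 0) + 1
    return histogram

# Usage on a short synthetic series (not experimental data).
demo_series = [1, 1, 0, 1, 0, 0, 1, 1, 1]
demo_chains = chain_lengths(demo_series)
demo_histogram = chain_histogram(demo_chains)
```

Keeping the phase definition as a set of statuses lets the same functions build the series for vapor and for liquid, which is what the joint analysis of the two series relies on.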

Fig. 3

Portion of a typical event series for the vapor phase as a basic (analyzed) fluid (1—vapor, 0—liquid) at sampling point A (Fig. 1B) in the unsteady cloud cavitation regime

Fig. 4

Sequence of event chains with different lengths \(i\) in chronological order for the event series shown in Fig. 3 discarding the measurements with false events

Fig. 5

Histogram of the repeatability of registered event chains with various lengths \(i\) (see Fig. 4) for the event series shown in Fig. 3. The line displays the exponential distribution \(nP\left(i\right)=n\lambda e^{-\lambda (i-1)}\) with parameters \(\lambda =1.34\), \(q=0.261\), \(n=1303\) and \(N=5000\) calculated for point A (Fig. 1B)

The probability distribution of such a process has an exponential form, \(P\left(i\right)\sim {q}^{i}\), where \(q\) is the probability of a single event and \(i\) is the number of consecutive repetitions of these events (the length of an event chain). Applying the normalization \(\sum_{i=1}^{n}P\left(i\right)=1\), this law becomes \(P\left( i \right) = \lambda e^{{ - \lambda \left( {i - 1} \right)}}\) for \(i \ge 1\), where \(\lambda = - \ln q\). The probability \(q\) is determined by the ratio of the total number of events \(n\) to the sample volume (the number of measurements) \(N\), so that \(q = n/N\). Figure 5 demonstrates that the measured distribution of the event repeatability obeys an exponential relationship with the calculated probability of a single event. This confirms the random nature of sampling when collecting statistics in the experiment. The quantity \(q_{V} = n_{V} /N\), where \({n}_{V}\) is the total number of events for the vapor phase, must match the local volume vapor fraction \(\beta ={V}_{V}/({V}_{V}+{V}_{L})\), where \(V_{V}\) and \(V_{L}\) are the local volumes of vapor and liquid, respectively. It is related to \(q_{L} = n_{L} /N\) by \(q_{V} = 1 - q_{L}\), where \(n_{L}\) is the total number of events for the liquid phase. In the present study, this relationship (\(q_{V} + q_{L} = 1\)) is fulfilled with an accuracy of 0.04 for \(N = 5000\), i.e., the statistical error \({\Delta } = 1 - \left( {q_{V} + q_{L} } \right)\) does not exceed 4% (Fig. 6).
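The quantities defined in this paragraph can be evaluated with a few lines of Python. The sketch below is a minimal illustration with hypothetical function names; it uses the values \(n=1303\) and \(N=5000\) quoted for point A in the caption of Fig. 5 and simply evaluates the formulas as written in the text.

```python
import math

def single_event_probability(n_events, n_samples):
    """q = n / N, the probability of a single event."""
    return n_events / n_samples

def decay_rate(q):
    """lambda = -ln(q), the parameter of the exponential law."""
    return -math.log(q)

def theoretical_chain_counts(n_events, q, max_length):
    """n * P(i) = n * lambda * exp(-lambda * (i - 1)) for i = 1..max_length."""
    lam = decay_rate(q)
    return [n_events * lam * math.exp(-lam * (i - 1))
            for i in range(1, max_length + 1)]

def statistical_error(q_vapor, q_liquid):
    """Delta = 1 - (q_V + q_L); zero when the two series are fully complementary."""
    return 1.0 - (q_vapor + q_liquid)

# Values quoted for sampling point A in the caption of Fig. 5.
q_a = single_event_probability(1303, 5000)
lambda_a = decay_rate(q_a)
curve_a = theoretical_chain_counts(1303, q_a, 5)
print(f"q = {q_a:.3f}, lambda = {lambda_a:.2f}")
```

Because \(e^{-\lambda}=q\), each successive value of the theoretical curve is smaller than the previous one by the factor \(q\), which is the slope that distinguishes one sampling point from another.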

Fig. 6

Distribution of the statistical error over all sampling points considered in the measurement domain around the hydrofoil (Fig. 1B) (the total number of the points is \({N}_{p}=722\))

Since the method of statistical analysis is developed as an extension of the standard PIV technique, it is applicable to the same kind of raw data and is subject to the same rules of use. In particular, the particle image diameter must be 2 to 4 pixels and the number of particles should exceed 10 per interrogation window irrespective of the physical flow scale (Adrian and Westerweel 2010). In practice, however, the latter condition is often difficult to meet. For this reason, the limit is lowered in this paper to 5 tracers per 32 × 32 pixel area, which is the typical size of an interrogation window in PIV. This is all the more justified when the goal is to identify the vapor phase. If the number of tracers falls below this threshold, the accuracy of the method would be considerably reduced, since a lower particle concentration leads to a significant loss of sensitivity. The upper bound for the particle concentration is practically unlimited, although individual tracers must remain discernible to avoid speckle patterns in which they overlap. Other general requirements of the PIV approach (Adrian and Westerweel 2010) appear less important for the present method and are therefore omitted here.

4 Implementation of the method of statistical analysis

The results of applying the developed method to real experimental data are presented in Fig. 7, where either vapor or liquid is considered as the basic fluid. The two fields show qualitatively the same picture: the probability of vapor occurrence increases toward the region of the attached cavity and progressively decreases away from the hydrofoil, as expected. However, these continuous distributions are difficult to analyze further because they reveal little detail. To facilitate their examination, all sampling points in the measurement domain of the cavitating flow are roughly divided into six qualitative groups (approximate zones of the time-averaged local vapor content) according to the value of the probability of a single event (see Table 1). Below, we only consider the points where events are registered in no less than 10% of measurements (i.e., \(n/N\ge 0.1\)) to minimize the influence of the statistical error. The points with \(n<0.1N\) and a nonzero first mode (bin) of the histogram (Fig. 5) are simply assigned to rare bubbles in liquid or rare liquid inclusions in the attached vapor cavity (Zone VI).

Fig. 7

Distributions of the probability coefficient q for the A vapor and B liquid phase in the cavitating flow around the hydrofoil when the sample volume \(N=5000\). The hydrofoil color mask is the same as in Fig. 1B

Table 1 Approximate flow zones with different time-averaged local vapor content distinguished on the basis of the probability of a single event \(q\) to facilitate a qualitative physical understanding of the observed process

Figure 8 shows that the sample volume \(N\) (the total number of statistical realizations) practically does not have an impact on the disposition and shapes of the zones. Approximate position and dimensions of the attached cavity can be estimated in Fig. 9. Comparing Figs. 8 and 9, it can be clearly seen that the cavity size and location in the photograph match Zones I and II in the fields calculated using the developed algorithm. Note that, when vapor is treated as the basic fluid in this analysis, Zone I appears to be split roughly in the middle (47 mm ≤ x ≤ 49 mm) by a small portion of Zone II (Fig. 8A.1 and 8A.2). This must be caused by periodic detachments of cloud cavities from the cavitation sheet right in this flow region and their subsequent shedding (Fig. 10). In the distributions for the liquid phase (Fig. 8B.1 and 8B.2), the stationary part of the attached cavity is not so clearly pronounced and, as a result, the place of periodic detachments of vapor clouds is impossible to identify.

Fig. 8

Distributions of the sampling points among the six zones indicated in Table 1 in the cavitating flow around the hydrofoil for different basic (analyzed) fluid (A—vapor and B—liquid) and two sample volumes: (1) \(N=1500\) and (2) \(N=5000\). The hydrofoil color mask is the same as in Fig. 1B

Fig. 9

Time-averaged photograph from high-speed imaging of the cavitating hydrofoil in the unsteady cloud cavitation regime relative to the reference grid

Fig. 10

Instantaneous velocity vector field around the cavitating hydrofoil in the unsteady cloud cavitation regime right after detachment of a cloud cavity from the cavitation sheet. Every sixth and every third vectors are shown in the streamwise and transversal directions, respectively. The hydrofoil color mask is the same as in Fig. 1B

The necessary condition for the applicability of the method of statistical analysis is that sampling in the experiment must be random over the entire ensemble of realizations, so that the process of collecting experimental data is not correlated with periodic phenomena inherent in the flow under investigation. This guarantees that the information on the distribution of the probability of vapor occurrence is reliable. In this study, such a periodic phenomenon is the shedding of cavitation clouds. To ensure that the measurements are statistically independent and, consequently, that the calculated values of the time-averaged local vapor content are valid, we carefully analyze below how well the obtained histograms match the theoretical exponential distributions. Apart from the technical issues influencing the precision of tracer identification (see Sect. 3), the accuracy of evaluating the probability of vapor occurrence at a given point is mainly determined by the size of the statistical ensemble of realizations (sample volume). The required sample volume depends on the lower bound of the probability to be detected: the smaller this bound, the larger the sample volume must be. A robust criterion for the sufficiency of statistics is the absence of significant discrepancies in the calculated probabilities between two random samplings with substantially different volumes.

The goodness of fit is verified separately for vapor and liquid using Pearson’s chi-square test (Chernoff and Lehmann 1954) by comparing a histogram of the event repeatability (like the one shown in Fig. 5), described by a function \({h}_{i}=h(i)\), where \(i\) is the index of a histogram bin and \({h}_{i}\) is its height, with the theoretical exponential law:

$$h\left( i \right)\sim nP\left( i \right) = n\lambda e^{{ - \lambda \left( {i - 1} \right)}} ,$$
(1)

which depends on the overall number of events \(n\). The null hypothesis is tested with Pearson’s criterion, expressed as follows:

$$\chi^{2} = \mathop \sum \limits_{i = 1}^{{m^{*} }} \frac{{\left( {h_{i} - kP\left( i \right)} \right)^{2} }}{kP\left( i \right)},\quad k = \mathop \sum \limits_{i = 1}^{{m^{*} }} h_{i} ,$$
(2)

where \({m}^{*}\) is the number of histogram bins over which the test statistic is calculated.
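Criterion (2) can be evaluated directly from the bin heights. The sketch below is a minimal illustration with hypothetical function names; it assumes that the histogram is passed as a list whose first element is the height of the bin \(i=1\) and that \(\lambda\) has already been estimated.

```python
import math

def exponential_law(i, lam):
    """Theoretical law P(i) = lambda * exp(-lambda * (i - 1)) for i >= 1."""
    return lam * math.exp(-lam * (i - 1))

def pearson_chi_square(heights, lam, m_star):
    """Criterion (2) evaluated over the first m_star bins of the histogram."""
    used = heights[:m_star]
    k = sum(used)  # total height of the bins that enter the test
    chi2 = 0.0
    for i, h in enumerate(used, start=1):
        expected = k * exponential_law(i, lam)
        chi2 += (h - expected) ** 2 / expected
    return chi2

# Usage on a synthetic two-bin histogram (not experimental data).
chi2_demo = pearson_chi_square([10, 0], lam=1.0, m_star=2)
```

In this synthetic case the first bin matches its expected height exactly, so the whole statistic comes from the empty second bin.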

Since experimental histograms can be highly irregular, i.e., have gaps (bins of zero height) and rather long tails (few measurements of long event chains), a correct implementation of the chi-square test requires determining the significant part of each histogram, namely the number \({m}^{*}\) of sufficiently reliable modes (bins) without gaps. This implies that the number of event chains in each bin must be no less than a minimum height \(H\). For a reliable analysis of the histograms, only modes with heights \({h}_{i}\ge H=5\) (i.e., at least five event chains in every mode) are used below. Cramér (1999) recommended that the number of events in an interval be at least 10. However, according to Cochran (1954), in practice it is admissible for the number of events to be less than 5 in the extreme intervals. Mann and Wald (1942) argued that it is even acceptable to reduce this number to one in a distribution tail.

Once the distribution law (1) is known, the number of significant modes can be related to the minimum number of event chains by \(H\left({m}^{*}\right)=nP\left({m}^{*}\right)\). It follows that

$$m^{*} = 1 + \frac{{{\text{ln}}\left( {n\lambda /H} \right)}}{\lambda }.$$
(3)

Next, we need to assess the characteristic number of significant modes \({m}^{*}\). For this, let us assume that the total number of events in a series is equal to half of the sample volume, i.e., \(n=N/2\) (upper estimate). This estimate comes from the condition for a true event that a given vector in a velocity field must have at least one neighbor with the same status. Then, for \(N=5000\) and \(1500\), the total number of events is \(n=N/2=2500\) and \(750\), respectively.

The limiting number of significant modes can be calculated using formula (3) by setting \(H\) to 1. Values of \({m}^{*}\) are given in Table 2 for two extreme values of the event probability, \(q=0.25\) and \(0.95\). It can be seen, for example, that for \(n=2500\) and \(q=0.25\), modes higher than 5 must contain no more than 5 event chains, while for \(n=750\) they are limited to four event chains (Table 2). For this reason, when applying Pearson’s criterion (2) for \(N=5000\) and \(1500\), we only consider the first five modes of the histograms, i.e., \({m}^{*}=5\). Points with fewer than 5 significant modes are excluded from further analysis and attributed to rare bubbles in liquid or rare liquid inclusions in the attached vapor cavity (Zone VI in Table 1), like those where \(n/N<0.1\) (see above).
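Formula (3) follows from setting \(H=n\lambda e^{-\lambda ({m}^{*}-1)}\) and solving for \({m}^{*}\), and it is straightforward to evaluate numerically. The sketch below is a hedged illustration with a hypothetical function name; it uses the parameter combinations discussed in the text and does not reproduce the entries of Table 2.

```python
import math

def significant_modes(n_events, q, min_height):
    """Formula (3): m* = 1 + ln(n * lambda / H) / lambda, with lambda = -ln(q)."""
    lam = -math.log(q)
    return 1.0 + math.log(n_events * lam / min_height) / lam

# Parameter combinations discussed in the text: n = N/2 for N = 5000 and 1500.
for n_events in (2500, 750):
    for q in (0.25, 0.95):
        limit = significant_modes(n_events, q, min_height=1)
        print(f"n = {n_events}, q = {q}: m* = {limit:.2f} (H = 1)")
```

The function returns a real number; how it is rounded to a whole number of bins is a separate choice that this sketch leaves open.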

Table 2 Maximum number of the significant modes of a histogram \({m}^{*}\) for various combinations of the main parameters

It is worth noting that the chi-square distribution \({\chi }_{r}^{2}\) turns out to be very sensitive to the value of parameter \(\lambda\) of the theoretical law. This parameter can be found according to the obtained repeatability sequences (event chains) in the two following ways:

  1. using the averaged value of histogram modes (i.e., the average repeatability of event chains), which, by the definition of the expected value of a random variable in the continuous case, is \(\langle m\rangle =\lambda {\int }_{1}^{\infty }\xi {e}^{-\lambda (\xi -1)}d\xi =1+1/\lambda\), hence \({\lambda }_{1}=1/(\langle m\rangle -1)\);

  2. relying entirely on the first mode of a histogram as the most reliable one, because the sample volume of a statistical ensemble is inevitably limited (\(N<\infty\)), which gives \({\lambda }_{2}=P(1)\).

In order to find \(\langle m\rangle\) for \({\lambda }_{1}\), we used heights of the modes of an experimental histogram, so that \(\langle m\rangle =\frac{1}{{m}_{0}}\sum_{i=1}^{{m}_{0}}i\cdot {h}_{i}\), where \({m}_{0}\) is the number of nonzero bins before the first zero mode.

Additional tests showed that a distribution with \({\lambda }_{2}\) fits better for histograms with \({m}_{0}<20\), whereas the one with \({\lambda }_{1}\) suits better if \({m}_{0}\ge 20\). Below, we use their weighted average for processing:

$$\lambda = \gamma \lambda_{1} + \left( {1 - \gamma } \right)\lambda_{2} ,\quad \gamma = \begin{cases} m_{0} /20 & \text{if}\quad m_{0} \le 20 \\ 1 & \text{if}\quad m_{0} > 20 \end{cases}.$$
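The weighting rule above can be written compactly in code. The sketch below is a minimal illustration with hypothetical function names; it assumes that \({\lambda }_{1}\) and \({\lambda }_{2}\) have already been estimated from the histogram and only implements the blending step.

```python
def blend_weight(m0, threshold=20):
    """gamma = m0 / 20 for m0 <= 20 and gamma = 1 for m0 > 20."""
    return min(m0 / threshold, 1.0)

def blended_lambda(lambda_1, lambda_2, m0):
    """Weighted average lambda = gamma * lambda_1 + (1 - gamma) * lambda_2."""
    gamma = blend_weight(m0)
    return gamma * lambda_1 + (1.0 - gamma) * lambda_2

# Usage with illustrative (not experimental) estimates of the two parameters.
short_histogram = blended_lambda(1.0, 2.0, m0=5)   # weighted toward lambda_2
long_histogram = blended_lambda(1.0, 2.0, m0=40)   # equal to lambda_1
```

For short histograms the result stays close to \({\lambda }_{2}\), and it moves linearly toward \({\lambda }_{1}\) as the number of nonzero bins approaches 20.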

When checking goodness of fit to the Pearson’s criterion (2), heights of all histogram modes are multiplied by the ratio of areas under a theoretical distribution in the interval \(i\in \left[1;{m}_{0}\right]\) and a corresponding experimental histogram. This is necessary to increase the histogram area which is obviously underestimated due to the limited volume of experimental statistics (\(N<\infty\)), adjusting it to the theoretical law. It is evident that such a correction does not affect the slope of the distributions which determines a particular zone that a sampling point belongs to (see Fig. 11).

Fig. 11

Histograms of the repeatability of registered events for the vapor phase (\(N=5000\)) in five sampling points in the measurement domain (Fig. 8) corresponding to different zones (Table 1). The lines show boundaries of the zones that are theoretical exponential functions \(nP\left(i\right)=n\lambda e^{-\lambda (i-1)}\) calculated for the four limiting values of the probability \(q\)

The critical \({\chi }_{r}^{2}\) value needed to assess the goodness of fit between the theoretical and experimental distributions depends, according to Pearson’s chi-square test, on the number of degrees of freedom \(r=m-k-1\), where \(m\) is the number of histogram modes and \(k\) is the number of parameters of the theoretical distribution (here \(k\) denotes a parameter count, not the sum of bin heights used in Eq. (2)). For \(m={m}^{*}=5\), \(r=3\) because the theoretical distribution depends on only one parameter, \(\lambda\) (\(k=1\)). The significance level in the \({\chi }_{r}^{2}\) distribution is usually taken equal to \(\alpha =0.05\). Hence, the critical value is \({\chi }_{r=3}^{2}=7.8\). This significance level means that the probability of erroneously rejecting the hypothesis that a histogram corresponds to an exponential distribution, when the hypothesis is true, does not exceed 5%; in other words, we accept that some histograms conforming to this law can be rejected with a 5% probability. It is seen in Fig. 12 that the distributions at some of the sampling points do not satisfy Pearson’s criterion. However, the disposition and shapes of the zones are mostly the same as in Fig. 8. The absence of some points in Fig. 12 is mainly associated with an increased statistical error in the corresponding flow region. It can also be explained by either insufficient statistics for the correct application of Pearson’s criterion or imperfect determination of the parameter \(\lambda\) in the theoretical exponential distribution.
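The acceptance decision described above reduces to a comparison with a tabulated critical value. The sketch below is a minimal illustration with hypothetical names; the lookup table holds standard chi-square critical values at \(\alpha =0.05\) for a few degrees of freedom, of which only the entry for \(r=3\) (rounded to 7.8) is quoted in the text.

```python
# Standard critical values of the chi-square distribution at alpha = 0.05,
# keyed by the number of degrees of freedom r.
CRITICAL_CHI2_ALPHA_005 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def degrees_of_freedom(n_modes, n_params=1):
    """r = m - k - 1, with k the number of fitted distribution parameters."""
    return n_modes - n_params - 1

def satisfies_pearson(chi2_value, n_modes, n_params=1):
    """True if the exponential-law hypothesis is not rejected at alpha = 0.05."""
    r = degrees_of_freedom(n_modes, n_params)
    return chi2_value <= CRITICAL_CHI2_ALPHA_005[r]

# Usage for the case considered in the text: m = m* = 5 modes, one parameter.
r_text = degrees_of_freedom(5)
accepted = satisfies_pearson(6.2, n_modes=5)    # illustrative statistic
rejected = satisfies_pearson(9.4, n_modes=5)    # illustrative statistic
```

A sampling point whose statistic exceeds the critical value would be left out of a map such as the one in Fig. 12.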

Fig. 12

Distributions of the sampling points satisfying the Pearson’s criterion among the six zones indicated in Table 1 in the cavitating flow around the hydrofoil for different basic (analyzed) fluid (A–vapor and B–liquid) at the sample volume \(N=5000\). The hydrofoil color mask is the same as in Fig. 1B

Figure 13 shows the overlap of the zones for the vapor and liquid phases from Fig. 8. As noted above, Zones I and II for vapor correspond to the stationary and pulsating parts of the attached cavity with rare liquid inclusions, respectively. Fig. 13 shows that, at some points, vapor (these two zones for vapor) is present together with liquid (Zone I for liquid) and individual vapor bubbles, so that they fall spatially into the same flow region. This implies that, if there is pure vapor with rare liquid inclusions in the first case and pure liquid with rare bubbles in the second case, there must be a vapor–liquid mixture with varied vapor content in between. The fact that the relation \({q}_{V}+{q}_{L}=1\) is fulfilled with good accuracy is reflected in the almost antisymmetric patterns of the zones for vapor and liquid.

Fig. 13

Distributions of the sampling points among the six zones indicated in Table 1 in the cavitating flow around the hydrofoil for the vapor and liquid phases together at the sample volume \(N=5000\). The hydrofoil color mask is the same as in Fig. 1B

Thus, we can infer that the simple algorithm for identifying the zones of time-averaged local vapor content, based on determining the probability of a single event as \(q=n/N\), is applicable and quite effective in analyzing the two-phase structure of cavitating flows. When using this method, it is mandatory to make sure that the statistical sampling in the measurements is random. Otherwise, the distributions of the event repeatability would differ from the exponential law, and the calculated values of the local vapor content would be incorrect. It is worth noting that the method of statistical analysis gives results that can only be considered an upper limit of the vapor content, since it cannot distinguish a continuous film of pure vapor from a frothy medium. In a flow region occupied by a frothy medium, the vapor content must evidently be lower than in the same region occupied by a continuous vapor cavity.
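The single-point computation summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on two assumptions: each PIV realization at a sampling point is reduced to a Boolean flag (`True` for a valid vector, i.e. a liquid event; `False` for a missing vector, i.e. a vapor event), and "repeatability" is taken to mean the length of a run of consecutive identical events. The function names and the short example sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def event_probability(flags, event):
    """Probability of a single event, q = n / N, where n is the number of
    realizations equal to `event` and N is the sample volume."""
    if not flags:
        raise ValueError("empty sample")
    n = sum(1 for f in flags if f == event)
    return n / len(flags)


def repeatability_histogram(flags, event):
    """Histogram {run length i: number of runs} of consecutive occurrences
    of `event` (assumed meaning of 'event repeatability')."""
    runs = (sum(1 for _ in group)
            for value, group in groupby(flags)
            if value == event)
    return Counter(runs)


# Hypothetical flag sequence at one sampling point (N = 10).
flags = [True, False, False, True, False, True, True, True, False, False]

q_vapor = event_probability(flags, event=False)   # vapor: no tracers found
q_liquid = event_probability(flags, event=True)   # liquid: valid vector
vapor_runs = repeatability_histogram(flags, event=False)

print(q_vapor, q_liquid, dict(vapor_runs))
```

By construction `q_vapor + q_liquid` equals 1 at every point, which mirrors the relation \({q}_{V}+{q}_{L}=1\) used when the vapor and liquid zone maps are compared. Assigning a point to one of the zones of Table 1 would additionally require the limiting values of \(q\) defined there, which are not reproduced in this sketch.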

5 Conclusions

This article presents a new method for obtaining information on the distribution of vapor (dispersed phase) in cavitating flow, based on statistical processing of PIV data for liquid (continuous phase). The approach rests on two main principles: the absence in the vapor phase of the tracer particles used to implement PIV in the liquid, and the statistical independence of successive velocity measurements. It follows from these principles that two sets of statistical information can be extracted from the PIV data: sequences of true and false events relating to the liquid and vapor phases, respectively. The statistical law of repeatability of these events is then analyzed for both datasets. If the measurements are indeed independent, the event repeatability obeys an exponential distribution law, and the coefficient of local vapor content at a given point of the flow is equal to the ratio of the number of false events to the sample volume (total number of measurements).

We carefully analyze the goodness of fit of the event repeatability to a theoretical exponential distribution using Pearson’s chi-square test. This confirms the statistical independence of the measurements and, consequently, the reliability of the calculated values of the coefficient of local vapor content. The constructed fields of this coefficient in the flow region under investigation make it possible to recognize the interface of an attached cavity on a test hydrofoil and its most stable and pulsating parts, as well as to determine the time-averaged dimensions of the sheet cavity and to find the place where cloud cavities detach from the hydrofoil surface. An increase in the sample volume does not significantly affect the accuracy of finding the cavity location and evaluating its size and shape; 1500 instantaneous realizations seem to be sufficient. However, the presented method of evaluating the vapor content, based on the probability of the absence of tracers in an interrogation window, gives results that can only be considered an upper limit of the vapor content, since this approach cannot distinguish a continuous film of pure vapor from a frothy medium.

Applying the method to both datasets, for the vapor and liquid phases, yields more complete information about the cavitating flow and allows the statistical error of the method to be assessed. This is important for cases with scarce statistics, especially for flow regions with a small number of events for liquid (rare liquid inclusions) or vapor (rare bubbles) and, consequently, lower reliability of the quantity considered. In particular, when only the liquid phase is analyzed, the location of cloud cavity detachments cannot be revealed even for a sample volume of 5000 realizations. By contrast, the analysis of the vapor phase registers both the stationary part of the cavitation sheet and the place of the cloud detachments with only 1500 measurements. However, the upstream boundary of the cavitation zone (the border between pure liquid and vapor–liquid mixture), where the number of bubbles is minimal, is found more accurately by considering the liquid phase, and it remains unchanged for both sample volumes (1500 and 5000).

In summary, when studying the two-phase structure of cavitating flows, the suggested method of statistical analysis simplifies data processing: experiments can be limited to standard PIV measurements in the liquid phase, which reduces the time spent on data processing and analysis and the computational resources that would otherwise be consumed by large amounts of different kinds of experimental results. For a more sophisticated analysis, it is necessary to consider both sets of PIV data, for vapor and for liquid. The joint analysis of both phases can provide sufficiently complete and reliable information about all characteristic zones with different two-phase properties in cavitating flow. Finally, the developed method of statistical analysis can in principle be applied to other types of two-phase dispersed flow, for example bubbly media or sprays, where the local content of dispersed particles also plays a crucial role.