1 Introduction

Studies on supersonic jet noise have been an object of research because the exhaust gases of the supersonic aircraft cause the noise contamination (Bailly and Fujii 2016; Raman 1999). In the imperfectly expanded jet conditions, the components of the supersonic jet noise can be classified into three categories: the turbulent mixing noise, the broadband shock-associated noise, and the screech tone (Tam 1995). The first two acoustic waves show a broadband spectrum in the frequency domain, while the screech tone has an intense sound pressure level and a discrete frequency. The generation mechanisms of these noises have been extensively investigated by performing numerical simulation and experiment (Powell 1953; Suzuki and Lele 2003; Tam 1995). Then, the dominant flow structures relating to supersonic jet noise have been clarified based on the modal decomposition such as singular value decomposition (SVD) and dynamic mode decomposition (DMD) (Xiangru et al. 2021; Andrew et al. 2017). A coherent structure which generates the screech tone by interfering with shock cell structure (Powell 1953) can be observed upstream. Then, large-scale turbulent structure and streak structure can be observed downstream.

Screech tone phenomena are known to promote mixing in downstream (Glass 1968; Krothapalli et al. 1986; Alkislar et al. 2003; Knowles and Saddington 2006). Glass (1968) discovered that the jet diameter of the axisymmetric underexpanded jet becomes larger due to increase of jet spread rate when acoustic feedback occurs by screech tones. Krothapalli et al. (1986) investigated the underexpanded jet from a converging rectangular nozzle and described the angle of jet spread is increased at the pressure ratio corresponding to the maximum screeching sound. In the asymmetric jet, this mixing promotion is caused by streamwise vortices in the mixing-layer region. The streamwise vortices produce a strong traverse outward velocity and enhance the transverse transport between the jet and the surrounding fluid there (Alkislar et al. 2003). Alkislar et al. (2003) explored the three-dimensional flow characteristics of a underexpanded rectangular jet by using a stereoscopic particle image velocimetry (stereo-PIV), and they elucidated that the three-dimensional deformation of the large-scale spanwise coherent vortical structures results in strong streamwise vortices. These spanwise vortical structures are generated by the self-excitation in the shear layer of the underexpanded jet. Husain and Hussain (1993) proposed the mechanism for the generation of the streamwise vortices based on the influence of mutual and self-induction of the spanwise vortical structures and corrobolated their proposed mechanism using numerical simulation data of the low-speed elliptic jet. They delineated the ribs consisting of streamwise vortices are formulated from the deformation of the spanwise vortical structures in detail.

The streak structure is also one of the structures which are associated with the streamwise vortices. The streak structure is the streamwise elongated structure and it is identified at low frequency and nonzero azimuthal wavenumbers through the mode decomposition such as spectral proper orthogonal decomposition (SPOD) and resolvent analysis (Nogueira et al. 2019; Pickering et al. 2020). The streamwise vortices have radial velocity components which lift fluid from low- to high-speed regions (leading to a low-speed streak) and from high- to low-speed regions (leading to a high-speed streak) (Nogueira et al. 2019). This phenomenon is referred to as lift-up mechanism, and the streak structure and the lift-up mechanism have long been understood as an important mechanism in wall-bounded flows (Brandt 2014; Schlatter et al. 2008; Hack and Zaki 2016). However, it is known that the wall is not necessary for the lift-up mechanism, and they have been identified in some previous works for the mechanism identification in jet flows (Nogueira et al. 2019; Pickering et al. 2020).

However, the discussion about the relationships between the mixing promotion by screech tones and the downstream structures such as the streak structure has not been conducted in detail by following the time histories of these structures in high temporal resolution.

Numerical simulations are superior to the experiment with regard to the spatial and temporal resolution of fluid data (Gojon and Bogey 2017; Arroyo et al. 2019; Li et al. 2020). However, its computational costs to analyze the data are so high that it could not obtain the flow field information in a longer time span. On the other hand, there is still a technological difficulty in measuring the experimental data in high spatial and temporal resolution with the existing measurement system. Therefore, we apply the spatiotemporal superresolution measurement technique for the experimental data and overcome it. Here, the superresolution technique includes the spatial superresolution (Fukami et al. 2023), which recovers the high spatial resolution data from the low spatial resolution data and was recently applied to experimental data (Ozawa et al. 2024), and the spatiotemporal superresolution, which recovers the high spatial and temporal resolution data from the combination of the low spatial but high temporal resolution data and the high spatial but low temporal resolution data. Here, the spatiotemporal superresolution measurement which adopts the latter is considered in this study.

The spatiotemporal superresolution measurement (Nickels et al. 2020; Zhang et al. 2020; Li and Ukeiley 2021) is the method to reconstruct the flow field in high spatial and temporal resolution by using data-driven techniques. Tu et al. (2013) has reconstructed a wake flow behind a model using a modified time-delay LSE (mTD-LSE) (Durgesh and Naughton 2010) that estimates the time-resolved flow field from the high sampling rate hot-wire measurement data and the correlation between the two. Here, proper orthogonal decomposition (POD) (Berkooz et al. 1993) is used to reduce the complexity of the physical fields before the correlation with the acoustic data is calculated. Li and Ukeiley (2022); Tinney et al. (2008) achieved a reconstruction of the time-resolved velocity fields in the axisymmetric subsonic jet using time-resolved pressure data. Ozawa et al. (2021) applied a similar framework into the aeroacoustic field of a Mach 1.35 supersonic jet, and successfully reconstructed the velocity fluctuations related to screech tone using a sparse linear regression model. They found that the estimation accuracy can be improved by least absolute shrinkage and selection operator (LASSO) regression (Ozawa et al. 2022). This method selects the high-correlative acoustic data with the velocity field and avoids over-learning. Although the smooth convection of the flow fluctuation related to the screech was observed (Ozawa et al. 2022), its reconstruction was limited to the screech phenomena. In addition, the spatiotemporal superresolution was also successfully applied to the density measurement using background oriented shlieren by Lee et al. (2023, 2024), but its reconstruction was limited to the screech phenomena as well as that using measured velocity field.

The present study proposes a spatiotemporal superresolution measurement framework based on DMD (Schmid 2010; Tu et al. 2014) with the Kalman filter (Kalman 1960) and the Rauch-Tung-Striebel (RTS) smoother (Rauch et al. 1965). Here, DMD can estimate the linear dynamical system with respect to coherent structures that grow, decay, and oscillate. Therefore, the matrix that expresses the temporal evolution of the velocity field is derived from DMD and is used for the state equation. The Kalman filter and the RTS smoother estimate and correct the time-resolved velocity field using the acoustic information.

We applied the proposed method to the experimental data of a Mach 1.35 axisymmetric jet. First, we constructed the dual PIV systems, and paired velocity fields with a short time interval were measured for a Mach 1.35 axisymmetric jet. The dual PIV and acoustic measurements were simultaneously performed and the time-resolved velocity field is estimated. Then, the time coefficients of DMD modes in high temporal resolution are calculated to investigate the relationships between the mixing promotion by the screech phenomena and the downstream structures.

2 Calculation procedure of the spatiotemporal superresolution measurement

Figure 1 illustrates the schematic image of the proposed spatiotemporal superresolution measurement. Note that this figure does not show the actual sampling rate. Paired velocity fields captured from two high-speed cameras (using dual PIV technology) are taken with only 2.5 µs of time difference between each of them, which allows us to obtain the information about the temporal evolution of the velocity field at that precise moment. The time difference 2.5 µs is half of the time difference of the sampling rate of the microphone measurement, and the reason why we choose this time difference is related to elimination of biased errors in the measurement as discussed later. On the other hand, the time between paired velocity fields is large because of limitations of the cameras, and the dynamics of the system might be completely lost. Therefore, acoustic measurement devices, that can work at very high sampling rates, are included in the experiment and used for reconstruction of the dynamics together. It may be important to remark that the sampling rate of the acoustic measurement is sufficiently high for capturing unsteady flow. DMD allows us to extract dynamical system information from the paired velocity fields and it brings an accurate reconstruction of the flow together with acoustic information at the end of the process.

In this case, the proposed method calculates the system matrix of the velocity field based on the DMD and the observation matrix as a linear regression coefficient matrix of the velocity field and acoustic data. Those matrices are used for the system and observation equations and applied to the Kalman filter and the RTS smoother. Here, the estimated velocity fields are the same as the observed ones at the time steps when the PIV measurements were conducted. On the other hand, the velocity field is estimated and interpolated by the Kalman filter and the RTS smoother at the time steps when only microphone measurements are conducted.

Fig. 1
figure 1

Schematic image of the proposed spatiotemporal superresolution measurement

The dual PIV is used in the present study, and the paired velocity fields with small time differences which can sufficiently resolve the temporal evolution of the supersonic jet flow are captured. It should be noted that the single camera PIV typically does not meet this shutter speed requirement. Here, the use of two cameras (\(\alpha\) and \(\beta\)) for the experiment will normally introduce systematic biased errors coming from, mainly, an asymmetrically processing of the snapshots in different parts of the calculation. This can cause noise in the measurement and errors in the DMD process if not addressed properly. Therefore, a specially designed timing chart for the cameras and lasers of the dual PIV and DMD are introduced to greatly improve the results.

Figure 2 shows the timing chart of the velocity fields taken by each camera. The time intervals for each PIV system and image pairs were set to be 1 µs and 2.5 µs, respectively. This setup allows reconstruction of the flow in both directions: from the camera \(\alpha\) to the camera \(\beta\) and vice versa. This technique reduces a possible error introduced when the cameras are not perfectly aligned or have slightly different characteristics in the configuration, lens or software. The experimental technique to realize this timing chart is described in Sect. 3 in detail. The final reconstruction, as it is remarked further in this document, is conducted by merging both of the reconstruction directions, reducing any error tendency that otherwise would be amplified when the process is repeated for the whole pool of snapshots. The presented timing chart repeats itself every four velocity fields. It is remarkable that the time interval between paired velocity fields is approximately 40 times greater than the time interval between the velocity fields of a pair, due to the camera limitations.

Fig. 2
figure 2

Timing chart of dual PIV. \(\alpha\), depicted in yellow, and \(\beta\), depicted in blue

The spatiotemporal superresolution measurement procedure is explained in detail in the following Subsections. Then, the summary of the analysis procedure is shown in Appendix A.

2.1 Dimensionality reduction of PIV data by singular value decomposition

The SVD is applied to the mean-subtracted velocity fields obtained by the cameras \(\alpha\) and \(\beta\). Then, the PIV data are lowered in dimension by a truncation. This dimensionality reduction is conducted for three reasons. The first reason is to select the fluid structures of which temporal evolution can be approximated by a linear model more accurately. The second reason is that the DMD is sensitive to noisy data, and noise components are also truncated. The third reason is that the calculation cost is reduced by lowering the dimension. The convergences of the mean velocity field and the POD mode energy are shown in Appendix B.

First, the PIV velocity fields are divided into PIV data matrices \(\textbf{X}^{\alpha } \in \mathbb {R}^{2n\times N}\) and \(\textbf{X}^{\beta } \in \mathbb {R}^{2n\times N}\) (n and N are numbers of spatial points in the velocity field and pairs of velocity fields, corresponding to, in this study, 26728 and 14990, respectively), depending on the camera origin of each velocity field due to the particular camera timing chart:

$$\begin{aligned} \textbf{X}^{\alpha }= & \left[ \begin{array}{ccccc} \widetilde{\textbf{u}}(t_{F}) & \widetilde{\textbf{u}}(t_{2 F}+0.5 \varDelta t) & \widetilde{\textbf{u}}(t_{3 F}) & \cdots & \widetilde{\textbf{u}}(t_{N F}+0.5 \varDelta t) \\ \widetilde{\textbf{v}}(t_{F}) & \widetilde{\textbf{v}}(t_{2F}+0.5 \varDelta t) & \widetilde{\textbf{v}}(t_{3F}) & \cdots & \widetilde{\textbf{v}}(t_{NF}+0.5 \varDelta t) \\ \end{array}\right] , \nonumber \\ \textbf{X}^{\beta }= & \left[ \begin{array}{ccccc} \widetilde{\textbf{u}}(t_{F}+0.5 \varDelta t) & \widetilde{\textbf{u}}(t_{2F}) & \widetilde{\textbf{u}}(t_{3F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{u}}(t_{NF}) \\ \widetilde{\textbf{v}}(t_{F}+0.5 \varDelta t) & \widetilde{\textbf{v}}(t_{2F}) & \widetilde{\textbf{v}}(t_{3F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{v}}(t_{NF}) \\ \end{array}\right] , \end{aligned}$$
(1)

where \(\widetilde{\textbf{u}}\in \mathbb {R}^{n}\) and \(\widetilde{\textbf{v}}\in \mathbb {R}^{n}\) are the axial and vertical components of the velocity fluctuation field, respectively, and \(0.5 \varDelta t\) is the short time interval of the paired velocity fields corresponding to, in this study, 2.5 µs (\(\varDelta t\) is the sampling period of the acoustic measurements corresponding to 5 µs). Also, the discrete-time \(t_{k}\) \((1 \le k \le NF)\) is defined based on the sampling rate of the microphone measurements, where F is the ratio of the sampling rate of the microphone and PIV measurements, corresponding to 40. Every pair of velocity fields is composed of the velocity field at \(t_{k}\) and the following one at \(t_{k}\)+ 0.5 \(\varDelta t\), even though they are stored in different matrices.

Next, velocity components of the horizontally flipped PIV snapshots are added to the PIV data matrices. Thereby, only symmetric and asymmetric spatial modes are extracted by the SVD. This process is based on the assumption that the dominant structures of the flow can be divided into symmetric and asymmetric structures, and it helps to improve the estimation accuracy of the spatial mode.

PIV data matrices which include flipped data \(\textbf{X}_{\text {Sym}}^{\alpha } \in \mathbb {R}^{2n\times 2N}\) and \(\textbf{X}_{\text {Sym}}^{\beta } \in \mathbb {R}^{2n\times 2N}\) are obtained as follows:

$$\begin{aligned} \textbf{X}_{\text {Sym}}^{\alpha }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{F}) & \cdots & \widetilde{\textbf{u}}(t_{NF}+ 0.5 \varDelta t) & \widetilde{\textbf{u}}_{\text {flip}}(t_{F}) & \cdots & \widetilde{\textbf{u}}_{\text {flip}}(t_{NF}+0.5 \varDelta t)\\ \widetilde{\textbf{v}}(t_{F}) & \cdots & \widetilde{\textbf{v}}(t_{NF}+0.5 \varDelta t) & -\widetilde{\textbf{v}}_{\text {flip}}(t_{F}) & \cdots & -\widetilde{\textbf{v}}_{\text {flip}}(t_{NF}+0.5 \varDelta t)\\ \end{array}\right] , \nonumber \\ \textbf{X}_{\text {Sym}}^{\beta }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{u}}(t_{NF}) & \widetilde{\textbf{u}}_{\text {flip}}(t_{F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{u}}_{\text {flip}}(t_{NF})\\ \widetilde{\textbf{v}}(t_{F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{v}}(t_{NF}) & -\widetilde{\textbf{v}}_{\text {flip}}(t_{F}+0.5 \varDelta t) & \cdots & - \widetilde{\textbf{v}}_{\text {flip}}(t_{NF})\\ \end{array}\right] , \end{aligned}$$
(2)

where \(\widetilde{\textbf{u}}_{\text {flip}}\in \mathbb {R}^{n}\) and \(\widetilde{\textbf{v}}_{\text {flip}}\in \mathbb {R}^{n}\) are the axial and vertical components of the horizontally flipped PIV snapshots, respectively.

Then, the SVD is applied to both PIV data matrices which include flipped data:

$$\begin{aligned} & \textbf{X}_{\text {Sym}}^{\alpha } \approx \textbf{U}^{\alpha (r)} \mathbf {\varSigma }^{\alpha (r)} \textbf{V}^{\alpha (r) \textsf{T}}, \end{aligned}$$
(3)
$$\begin{aligned} & \textbf{X}_{\text {Sym}}^{\beta } \approx \textbf{U}^{\beta (r)} \mathbf {\varSigma }^{\beta (r)} \textbf{V}^{\beta (r) \textsf{T}}, \end{aligned}$$
(4)

where \(\textbf{U}^{\alpha (r)}, \textbf{U}^{\beta (r)}\in \mathbb {R}^{2n\times r}\) are an orthogonal matrices of the spatial modes, \(\mathbf {\varSigma }^{\alpha (r)}, \mathbf {\varSigma }^{\beta (r)}\in \mathbb {R}^{r\times r}\) are matrices of which diagonal components are singular values \(\sigma\), and \(\textbf{V}^{\alpha (r)}, \textbf{V}^{\beta (r)}\in \mathbb {R}^{2N\times r}\) are orthogonal matrices of the temporal modes. Also, r is the number of the POD modes for dimensionality reduction.

POD mode coefficients of PIV data matrices which are reduced in dimension, \(\textbf{Z}_{\text {1st}}^{\alpha }, \textbf{Z}_{\text {{1st}}}^{\beta }, \textbf{Z}_{\text {2nd}}^{\alpha },\textbf{Z}_{\text {2nd}}^{\beta } \in \mathbb {R}^{r\times 0.5N}\) are derived as follows:

$$\begin{aligned} & \textbf{Z}_{\text {1st}}^{\alpha }=\textbf{U}^{\alpha (r)\textsf{T}} \textbf{X}_{{\text {1st}}}^{\alpha }, \end{aligned}$$
(5)
$$\begin{aligned} & \textbf{Z}_{{\text {1st}}}^{\beta }=\textbf{U}^{\beta (r)\textsf{T}} \textbf{X}_{\text {1st}}^{\beta }, \end{aligned}$$
(6)
$$\begin{aligned} & \textbf{Z}_{\text {2nd}}^{\alpha }=\textbf{U}^{\alpha (r)\textsf{T}} \textbf{X}_{\text {2nd}}^{\alpha }, \end{aligned}$$
(7)
$$\begin{aligned} & \textbf{Z}_{\text {2nd}}^{\beta }=\textbf{U}^{\beta (r)\textsf{T}} \textbf{X}_{\text {2nd}}^{\beta }, \end{aligned}$$
(8)

where data matrices \(\textbf{X}_{\text {1st}}^{\alpha } \in \mathbb {R}^{2n\times 0.5N}\) and \(\textbf{X}_{\text {1st}}^{\beta } \in \mathbb {R}^{2n\times 0.5N}\) are the “1st” velocity fields of the paired velocity fields obtained by the cameras \(\alpha\) and \(\beta\), respectively:

$$\begin{aligned} \textbf{X}_{\text {1st}}^{\alpha }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{F}) & \widetilde{\textbf{u}}(t_{3F}) & \widetilde{\textbf{u}}(t_{5F}) & \cdots & \widetilde{\textbf{u}}(t_{(N-1)F}) & \\ \widetilde{\textbf{v}}(t_{F}) & \widetilde{\textbf{v}}(t_{3F}) & \widetilde{\textbf{v}}(t_{5F}) & \cdots & \widetilde{\textbf{v}}(t_{(N-1)F}) & \\ \end{array}\right] , \nonumber \\ \textbf{X}_{\text {1st}}^{\beta }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{2F}) & \widetilde{\textbf{u}}(t_{4F}) & \widetilde{\textbf{u}}(t_{NF}) & \cdots & \widetilde{\textbf{u}}(t_{NF}) & \\ \widetilde{\textbf{v}}(t_{2F}) & \widetilde{\textbf{v}}(t_{4F}) & \widetilde{\textbf{v}}(t_{NF}) & \cdots & \widetilde{\textbf{v}}(t_{NF}) & \\ \end{array}\right] , \end{aligned}$$
(9)

Then, data matrices \(\textbf{X}_{\text {2nd}}^{\alpha } \in \mathbb {R}^{2n\times 0.5N}\) and \(\textbf{X}_{\text {2nd}}^{\beta } \in \mathbb {R}^{2n\times 0.5N}\) are the “2nd” velocity fields after \(0.5 \varDelta t\) of the paired velocity fields obtained by the cameras \(\alpha\) and \(\beta\), respectively:

$$\begin{aligned} \textbf{X}_{\text {2nd}}^{\alpha }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{2F}+0.5 \varDelta t) & \widetilde{\textbf{u}}(t_{4F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{u}}(t_{NF}+0.5 \varDelta t) & \\ \widetilde{\textbf{v}}(t_{2F}+0.5 \varDelta t) & \widetilde{\textbf{v}}(t_{4F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{v}}(t_{NF}+0.5 \varDelta t) & \\ \end{array}\right] , \nonumber \\ \textbf{X}_{\text {2nd}}^{\beta }= & \left[ \begin{array}{cccccccc} \widetilde{\textbf{u}}(t_{F}+0.5 \varDelta t) & \widetilde{\textbf{u}}(t_{3F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{u}}(t_{(N-1)F}+0.5 \varDelta t) & \\ \widetilde{\textbf{v}}(t_{F}+0.5 \varDelta t) & \widetilde{\textbf{v}}(t_{3F}+0.5 \varDelta t) & \cdots & \widetilde{\textbf{v}}(t_{(N-1)F}+0.5 \varDelta t) & \\ \end{array}\right] , \end{aligned}$$
(10)

2.2 Total least-square dynamic mode decomposition

The temporal evolution of the velocity field in the short time interval is obtained, and the system matrix and the system noise covariance matrix of the Kalman filter and the RTS smoother are constructed for the spatiotemporal superresolution in Sect. 2.4.

The present study applied tlsDMD to POD mode coefficients of the velocity fields which were reduced in dimension by SVD, and the temporal evolution in the short time interval is calculated. The best-fit linear operators in the following equations, \(\textbf{A}^{0.5}_{\alpha \rightarrow \beta } \in \mathbb {R}^{r\times r}\) from the camera \(\alpha\) to the camera \(\beta\) and \(\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\in \mathbb {R}^{r\times r}\) from the camera \(\beta\) to the camera \(\alpha\) (Fig. 3) are shown as follows:

$$\begin{aligned} & \textbf{Z}_{\text {2nd}}^{\beta }=\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\textbf{Z}_{\text {1st}}^{\alpha }, \end{aligned}$$
(11)
$$\begin{aligned} & \textbf{Z}_{\text {2nd}}^{\alpha }=\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\textbf{Z}_{\text {1st}}^{\beta }, \end{aligned}$$
(12)

where the time interval between the paired POD mode coefficients is 2.5 µs. The number of DMD modes was determined in Appendix C. Also, the convergence of the best-fit linear operators when the number of snapshots is changed, is shown in Appendix B.

If standard DMD is used and the best-fit linear operators is obtained, Eqs. (11) and (12) are solved by least-squares solution, minimizing noises which are included in only \(\textbf{Z}_{\text {2nd}}^{\alpha }\) and \(\textbf{Z}_{\text {2nd}}^{\beta }\). However, actually, noises are included in \(\textbf{Z}_{\text {1st}}^{\alpha }\) and \(\textbf{Z}_{\text {1st}}^{\beta }\) as well. If they are not reduced, small errors will add up and significantly increase errors in the final reconstruction. Thus, tlsDMD is used, and the supposition that errors (\({\varDelta }\textbf{Z}_{\text {1st}}\), \({\varDelta }\textbf{Z}_{\text {2nd}}\)) can be found in both \(\textbf{Z}_{\text {1st}}\) and \(\textbf{Z}_{\text {2nd}}\) is corporated into the calculation of the best-fit linear operators. This new assumption has been proven to give more accurate results than that of standard DMD in some particular cases. The procedure form of tlsDMD takes this shape:

$$\begin{aligned} & \min _{\textbf{A}^{0.5}_{\alpha \rightarrow \beta },{\varDelta } \textbf{Z}_{\text {2nd}}^{\beta },{\varDelta }\textbf{Z}_{\text {1st}}^{\alpha }}\left\| \left( {\begin{array}{c}{\varDelta }\textbf{Z}_{\text {2nd}}^{\beta }\\ {\varDelta }\textbf{Z}_{\text {1st}}^{\alpha }\end{array}}\right) \right\| ,\; \text {subject to} \; \;\textbf{Z}_{\text {2nd}}^{\beta }\nonumber \\ & +{\varDelta }{\textbf{Z}_{\text {2nd}}^{\beta }}=\textbf{A}^{0.5}_{\alpha \rightarrow \beta } (\textbf{Z}_{\text {1st}}^{\alpha }+{\varDelta }{\textbf{Z}_{\text {1st}}^{\alpha }}). \end{aligned}$$
(13)
$$\begin{aligned} & \min _{\textbf{A}^{0.5}_{\beta \rightarrow \alpha },{\varDelta } \textbf{Z}_{\text {2nd}}^{\alpha },{\varDelta }\textbf{Z}_{\text {1st}}^{\beta }}\left\| \left( {\begin{array}{c}{\varDelta }\textbf{Z}_{\text {2nd}}^{\alpha }\\ {\varDelta }\textbf{Z}_{\text {1st}}^{\beta }\end{array}}\right) \right\| ,\; \text {subject to} \; \;\textbf{Z}_{\text {2nd}}^{\alpha }\nonumber \\ & +{\varDelta }{\textbf{Z}_{\text {2nd}}^{\alpha }}=\textbf{A}^{0.5}_{\beta \rightarrow \alpha } (\textbf{Z}_{\text {1st}}^{\beta }+{\varDelta }{\textbf{Z}_{\text {1st}}^{\beta }}). \end{aligned}$$
(14)

Refer to Hemati et al. (2015) for a detailed tlsDMD calculation method.

Fig. 3
figure 3

Best-fit linear operators, \(\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\) from the camera \(\alpha\) to the camera \(\beta\) and \(\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\) from the camera \(\beta\) to the camera \(\alpha\)

Since POD mode coefficients used for the calculation of each matrix (\(\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\) and \(\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\)) are different, the dynamical information in them may not be similar. Therefore, a new best-fit linear operator \(\textbf{A} \in \mathbb {R}^{r\times r}\) that summarizes both dynamical information is calculated as follows:

$$\begin{aligned} \textbf{A}=\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\textbf{A}^{0.5}_{\alpha \rightarrow \beta }. \end{aligned}$$
(15)

The objective of this process is to revert the asymmetricity provoked by the order in which the dataset is analyzed in the previous step of the calculation, a bias that is present in \(\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\) as well as in \(\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\). As a result, the temporal resolution of the reconstruction will reduce by half and the sampling frequency of the acoustic measurements matches the temporal resolution of \(\mathbf {{A}}\) (5 \(\mu\)s). This new best-fit linear operator \(\textbf{A}\) is used as the system matrix of the Kalman filter and the RTS smoother.

When \(\textbf{z}_{k}^{\alpha }{\in \mathbb {R}^{r}}\) is the POD mode coefficients at a time step k obtained by the camera \(\alpha\), \(\textbf{z}_{k+1}^{\alpha }{\in \mathbb {R}^{r}}\) at the next time step is estimated by multiplying \(\mathbf {{A}}\) to \(\textbf{z}_{k}^{\alpha }\) (Fig. 4) as follows:

$$\begin{aligned} \textbf{z}_{k+1}^{\alpha }=\textbf{A}\textbf{z}_{k}^{\alpha } + \textbf{v}_{k}, \end{aligned}$$
(16)

where \(\textbf{v}_{k}{\in \mathbb {R}^{r}}\) is the noise when \(\textbf{z}_{k+1}^{\alpha }\) is estimated. The algorithm adopted here, which cancels out the biased error, is very similar to one of the denoising technique of DMD: forward-backward DMD (Hemati et al. 2017). See the reference (Hemati et al. 2014) for more details.

Fig. 4
figure 4

Estimation of the POD mode coefficient at the next time step by best-fit linear operator

Next, the covariance matrix of the noise \(\textbf{v}_{k}\) in Eq. (16), \(\textbf{Q}{\in \mathbb {R}^{r\times r}}\) is calculated. \(\textbf{Q}\) indicates the reliability of the next step estimation by \(\textbf{A}\), and it is used as the system noise covariance matrix of the Kalman filter and the RTS smoother in Sect. 2.4. \(\textbf{Q}\) is defined as follows:

$$\begin{aligned} \textbf{Q} = \textrm{E} \left[ \textbf{v}_{k} {\textbf{v}_{k}}^{\textsf{T}}\right] . \end{aligned}$$
(17)

\(\textbf{Q}\) is estimated by dividing Eq. (16) into the following equations:

$$\begin{aligned} & \textbf{z}_{k+0.5}^{\alpha }=\textbf{A}^{0.5}_{\alpha \rightarrow \beta } \textbf{z}_{k}^{\alpha } + \textbf{v}^{0.5}_{\alpha \rightarrow \beta }, \end{aligned}$$
(18)
$$\begin{aligned} & \textbf{z}_{k+1}^{\alpha }=\textbf{A}^{0.5}_{\beta \rightarrow \alpha } \textbf{z}_{k+0.5}^{\alpha } + \textbf{v}^{0.5}_{\beta \rightarrow \alpha }, \end{aligned}$$
(19)

where \(\textbf{z}_{k+0.5}^{\alpha }{\in \mathbb {R}^{r}}\) is the POD mode coefficients when only \(\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\) is multiplied to \(\textbf{z}_{k}^{\alpha }\). Also, \(\textbf{v}^{0.5}_{\beta \rightarrow \alpha }{\in \mathbb {R}^{r}}\) and \(\textbf{v}^{0.5}_{\alpha \rightarrow \beta }{\in \mathbb {R}^{r}}\) are gaussian noises generated by multiplying \(\textbf{A}^{0.5}_{\beta \rightarrow \alpha }\) and \(\textbf{A}^{0.5}_{\alpha \rightarrow \beta }\), respectively. The characteristics of \(\textbf{v}^{0.5}_{\beta \rightarrow \alpha }\in \mathbb {R}^{r}\) and \(\textbf{v}^{0.5}_{\alpha \rightarrow \beta }\in \mathbb {R}^{r}\) were discussed in Appendix D.

Then, Eq. (18) is substituted into Eq. (19) as follows:

$$\begin{aligned} \textbf{z}_{k+1}^{\alpha }=\textbf{A} \textbf{z}_{k}^{\alpha } + \textbf{A}^{0.5}_{\beta \rightarrow \alpha } \textbf{v}^{0.5}_{\alpha \rightarrow \beta } + \textbf{v}^{0.5}_{\beta \rightarrow \alpha }. \end{aligned}$$
(20)

By comparing right-hand side of Eqs. (16) and  (20), we obtain

$$\begin{aligned} \textbf{v}_{k}=\textbf{A}^{0.5}_{\beta \rightarrow \alpha } \textbf{v}^{0.5}_{\alpha \rightarrow \beta } + \textbf{v}^{0.5}_{\beta \rightarrow \alpha }. \end{aligned}$$
(21)

Therefore, \(\textbf{Q}\) is represented as follows:

$$\begin{aligned} \textbf{Q} = \textbf{A}^{0.5}_{\beta \rightarrow \alpha } \mathbf {Q_{\alpha \rightarrow \beta }} {\textbf{A}^{0.5}_{\beta \rightarrow \alpha }}^{\textsf{T}} + \mathbf {Q_{\beta \rightarrow \alpha }}, \end{aligned}$$
(22)

where \(\mathbf {Q_{\alpha \rightarrow \beta }}{\in \mathbb {R}^{r\times r}}\) and \(\mathbf {Q_{\beta \rightarrow \alpha }}{\in \mathbb {R}^{r\times r}}\) are the covariance matrix of \(\textbf{v}^{0.5}_{\alpha \rightarrow \beta }\) and \(\textbf{v}^{0.5}_{\beta \rightarrow \alpha }\), respectively:

$$\begin{aligned} & \textbf{Q}_{\alpha \rightarrow \beta } = \textrm{E} \left[ \textbf{v}^{0.5}_{\alpha \rightarrow \beta } {\textbf{v}^{0.5}_{\alpha \rightarrow \beta }}^{\textsf{T}}\right] , \end{aligned}$$
(23)
$$\begin{aligned} & \mathbf {Q_{\beta \rightarrow \alpha }} = \textrm{E} \left[ \textbf{v}^{0.5}_{\beta \rightarrow \alpha } {\textbf{v}^{0.5}_{\beta \rightarrow \alpha }}^{\textsf{T}}\right] . \end{aligned}$$
(24)

\(\mathbf {Q_{\alpha \rightarrow \beta }}\) and \(\mathbf {Q_{\beta \rightarrow \alpha }}\) in Eqs. (23) and (24), respectively, are approximated by sample average as follows:

$$\begin{aligned} & \mathbf {\widehat{Q}_{\alpha \rightarrow \beta }} = \frac{1}{N-1}\mathbf {\widehat{V}}^{0.5}_{\alpha \rightarrow \beta } {\mathbf {\widehat{V}}^{0.5\textsf{T}}_{\alpha \rightarrow \beta }} \end{aligned}$$
(25)
$$\begin{aligned} & \mathbf {\widehat{Q}_{\beta \rightarrow \alpha }} = \frac{1}{N-1}\mathbf {\widehat{V}}^{0.5}_{\beta \rightarrow \alpha } {\mathbf {\widehat{V}}^{0.5\textsf{T}}_{\beta \rightarrow \alpha }}, \end{aligned}$$
(26)

whereas \(\mathbf {\widehat{V}}^{0.5}_{\alpha \rightarrow \beta }{\in \mathbb {R}^{r\times N}}\) and \(\mathbf {\widehat{V}}^{0.5}_{\beta \rightarrow \alpha }{\in \mathbb {R}^{r\times N}}\) are estimated as follows:

$$\begin{aligned} & \mathbf {\widehat{V}}^{0.5}_{\alpha \rightarrow \beta } = \textbf{Z}_{\text {2nd}}^{\beta }-\textbf{A}^{0.5}_{\alpha \rightarrow \beta } \textbf{Z}_{\text {1st}}^{\alpha }, \end{aligned}$$
(27)
$$\begin{aligned} & \mathbf {\widehat{V}}^{0.5}_{\beta \rightarrow \alpha } = \textbf{Z}_{\text {2nd}}^{\alpha }-\textbf{A}^{0.5}_{\beta \rightarrow \alpha } \textbf{Z}_{\text {1st}}^{\beta }. \end{aligned}$$
(28)

2.3 The construction of the observation matrix

The linear regression coefficient matrix of the velocity field and acoustic data is obtained, and the observation matrix and the observation noise covariance matrix of the Kalman filter and the RTS smoother are constructed for the spatiotemporal superresolution in Subsection 2.4.

The observed vector in the present study, \(\textbf{y}_{k}{\in \mathbb {R}^{n_a}}\), consists of the mean-subtracted acoustic signals \({\widetilde{a}}_i(t_k)\) (\(1\le i \le n_a\)) of microphones:

$$\begin{aligned} \textbf{y}_{k}=\left[ \begin{array}{c} {\widetilde{a}}_{1}(t_{k}) \\ {\widetilde{a}}_{2}(t_{k}) \\ \vdots \\ {\widetilde{a}}_{n_a}(t_{k}) \end{array}\right] , \end{aligned}$$
(29)

where \(n_a\) is the number of microphones, in this study, corresponding to 18.

The observation matrix \(\textbf{C}{\in \mathbb {R}^{n_a\times r}}\) is calculated as the linear regression coefficients matrix of the POD mode coefficients of PIV data \(\textbf{Z}^{\alpha } \in \mathbb {R}^{r\times N}\) and acoustic data matrix \(\textbf{M} \in \mathbb {R}^{n_a\times N}\):

$$\begin{aligned} & \textbf{C}=\textbf{M}\textbf{Z}^{\alpha \dag }, \end{aligned}$$
(30)
$$\begin{aligned} & \textbf{Z}^{\alpha }=\textbf{U}^{\alpha (r)\textsf{T}} \textbf{X}_{\text {Sym}}^{\alpha }, \end{aligned}$$
(31)
$$\begin{aligned} & \textbf{M}=\left[ \begin{array}{cccccccc} \textbf{y}_{F}&\textbf{y}_{2F}&\cdots&\textbf{y}_{NF}&\end{array}\right] . \end{aligned}$$
(32)

In addition, the observation noise covariance matrix \(\textbf{R}{\in \mathbb {R}^{n_a\times n_a}}\) is also calculated. \(\textbf{R}\) indicates the reliability of the linear regression by \(\textbf{C}\), and it is used as the observation noise covariance matrix of the Kalman filter and the RTS smoother. \(\textbf{R}\) is defined as follows:

$$\begin{aligned} \textbf{R} = \lambda \textrm{E} \left[ \textbf{w}_{k} \textbf{w}_{k}^{\textsf{T}}\right] , \end{aligned}$$
(33)

Here, \(\textbf{w}_{k}{\in \mathbb {R}^{n_a}}\) is observation noises as follows:

$$\begin{aligned} \textbf{y}_{k} =\textbf{C} \textbf{z}_{k}^{\alpha }+\textbf{w}_{k} \end{aligned}$$
(34)

Also, \(\lambda\) is the hyperparameter that alters the strength of the observation noise against that of the system noise. The condition \(\lambda =1\) corresponds to the estimation based on the simple sample average, but it may not give us the best results for the spatiotemporal superresolution. Therefore, \(\lambda\) is optimized by cross-validation as shown in Appendix C.

\(\textbf{R}\) in Eq. (33) is approximated by sample average as follows:

$$\begin{aligned} \mathbf {\widehat{R}} = \frac{\lambda }{N}\mathbf {\widehat{W}}_{k} \mathbf {\widehat{W}}_{k}^{\textsf{T}}, \end{aligned}$$
(35)

whereas observation noises \(\mathbf {\widehat{W}}_{k}{\in \mathbb {R}^{n_a\times N}}\) are estimated as follows:

$$\begin{aligned} \mathbf {\widehat{W}}_{k} = \textbf{M}-\textbf{C} \textbf{Z}^{\alpha }. \end{aligned}$$
(36)

2.4 The Kalman filter and the RTS smoother

The spatiotemporal superresolution measurement is conducted by estimating and interpolating the POD mode coefficient by the Kalman filter and the RTS smoother at the time steps when only microphone measurements are conducted.

First, the Kalman filter considers the linear system and observation equations as follows:

$${\mathbf{z}}_{{k + 1}}^{\alpha } = {\mathbf{Az}}_{k}^{\alpha } + {\mathbf{v}}_{k} ,$$
(37)
$${\mathbf{y}}_{k} = {\text{ }}{\mathbf{Cz}}_{k}^{\alpha } + {\mathbf{w}}_{k} ,$$
(38)

where \(\textbf{z}_{k}^{\alpha }\), \(\textbf{z}_{k+1}^{\alpha }\), \(\textbf{y}_{k}\), \(\textbf{A}\), \(\textbf{C}\), \(\textbf{v}_{k}\) and \(\textbf{w}_{k}\) in Subsections 2.2 and 2.3 are used as the state vector at the kth time step and at the \((k +1)\)th time step, observed vector at the kth time step, the system and observation matrices, and the system and observed noises, respectively.

The prediction step is expressed using the system noise covariance matrix \(\textbf{Q}\) and the system matrix \(\textbf{A}\) as the following equations:

$$\widehat{{\mathbf{z}}}_{{k + 1}}^{{\alpha ^{ - } }} = {\text{ }}{\mathbf{A}}\widehat{{\mathbf{z}}}_{k}^{\alpha } ,$$
(39)
$${\mathbf{P}}_{{k + 1}}^{ - } = {\text{ }}{\mathbf{AP}}_{k} {\mathbf{A}}^{T} + {\mathbf{Q}},{\text{ }}$$
(40)

where \(\widehat{\textbf{z}}_{k}^{\alpha }{\in \mathbb {R}^{r}}\), \(\widehat{\textbf{z}}_{k+1}^{\alpha ^{-}}{\in \mathbb {R}^{r}}\), \(\textbf{P}_{k}{\in \mathbb {R}^{r\times r}}\) and \(\textbf{P}_{{k+1}}^{-}{\in \mathbb {R}^{r\times r}}\) are the estimate of the state vector at kth time step, the prior estimate of the state vector at \((k+1)\)th time step, the error covariance matrix at kth time step and the prior estimate of the error covariance matrix at \((k+1)\)th time step, respectively.

The filtering step can be expressed using the observation noise covariance matrix \(\textbf{R}\) and the observation matrix \(\textbf{C}\):

$${\mathbf{K}}_{k} = {\text{ }}{\mathbf{P}}_{{k + 1}}^{ - } {\mathbf{C}}^{T} ({\mathbf{CP}}_{{k + 1}}^{ - } {\mathbf{C}}^{T} + {\mathbf{R}})^{{ - 1}} ,$$
(41)
$$\widehat{{\mathbf{z}}}_{{k + 1}}^{\alpha } = {\text{ }}\widehat{{\mathbf{z}}}_{{k + 1}}^{{\alpha ^{ - } }} + {\mathbf{K}}_{k} ({\mathbf{y}}_{k} - {\mathbf{C}}\widehat{{\mathbf{z}}}_{{k + 1}}^{{\alpha ^{ - } }} ),$$
(42)
$${\mathbf{P}}_{{k + 1}} = {\text{ }}{\mathbf{P}}_{{k + 1}}^{ - } + {\mathbf{K}}_{k} {\mathbf{CP}}_{{k + 1}}^{ - } ,$$
(43)

where \(\textbf{K}_{k}{\in \mathbb {R}^{r\times n_a}}\) is the Kalman gain at the kth time step.

Then, the estimated state vector is smoothed by applying the RTS smoother. The Kalman filter estimates the state vector by using only information at the past time steps. On the other hand, the RTS smoother smoothes the state vector by using the information at all time steps in the interval. Therefore, the accuracy of the spatiotemporal superresolution is improved.

The smoothing process is shown by the following equations:

$$\widehat{{\mathbf{z}}}_{{k|k_{{{\text{end}}}} }} = {\text{ }}\widehat{{\mathbf{z}}}_{k} + {\mathbf{B}}_{k} (\widehat{{\mathbf{z}}}_{{k + 1|k_{{{\text{end}}}} }} - {\mathbf{A}}\widehat{{\mathbf{z}}}_{k} ),$$
(44)
$${\mathbf{P}}_{{k|k_{{{\text{end}}}} }} = {\text{ }}{\mathbf{P}}_{k} + {\mathbf{B}}_{k} ({\mathbf{P}}_{{k + 1|k_{{{\text{end}}}} }} - {\mathbf{AP}}_{k} {\mathbf{A}}^{T} - {\mathbf{Q}}){\mathbf{B}}_{k}^{T} ,$$
(45)
$${\mathbf{B}}_{k} = {\text{ }}{\mathbf{P}}_{k} {\mathbf{A}}^{T} {\mathbf{P}}_{{k + 1}}^{{ - ^{{ - 1}} }} .{\text{ }}$$
(46)

where \(\widehat{\textbf{z}}_{k|k_\text {end}}\in \mathbb {R}^{r}\), \(\textbf{P}_{k|k_\text {end}}\in \mathbb {R}^{r \times r}\) and \(\textbf{B}_k\in \mathbb {R}^{r \times r}\) are the final estimates of the state vector and the error covariance matrix by the RTS smoother, and the smoothing gain. Also, \(k_\text {end}\) is the last time step in the filtering and smoothing time interval. The Kalman filter and the RTS smoother are applied to each time duration between the ‘\(\alpha\)’ images of two consecutive image pairs (not between paired images), and the information given by the PIV images is assumed to be the ground truth by setting the very low noise covariance for the PIV images.

2.5 Extraction of time histories of dominant structures in high temporal resolution

In this section, the time histories of dominant structures in the supersonic jet at the same temporal resolution as the acoustic data are extracted to investigate correlations between them.

The dominant spatial structures which oscillate with characteristic frequencies and growth/decay rates are obtained by projecting DMD modes to the velocity field as shown in Eq. (47), because DMD is applied to POD mode coefficients of the velocity field in this study.

$$\begin{aligned} \mathbf {\Psi }=\textbf{U}^{\alpha (r)} \mathbf {\Phi }, \end{aligned}$$
(47)

where \(\mathbf {\Phi } \in \mathbb {R}^{r\times r'}\) is the DMD modes which are obtained by applying eigenvalue decomposition to the best-fit linear operator \(\textbf{A}\), and \(\mathbf {\Psi } \in \mathbb {R}^{2n\times r'}\) is the DMD modes projected to the velocity field which show dominant spatial structures. Also, \(r'\) is the number of DMD modes.

The time histories of these DMD modes projected to the velocity field can be extracted by calculating DMD mode coefficients of them. The time-resolved velocity field data matrix \(\textbf{X}\in \mathbb {R}^{2n \times NF}\) is the superposition of products of projected DMD modes \(\mathbf {\Psi }_{m} \in \mathbb {R}^{2n}\) and DMD mode coefficients \(\mathbf {L_{m}}\in \mathbb {R}^{1 \times NF}\), as follows:

$$\begin{aligned} & \textbf{X}=\mathbf {\Psi } \textbf{L}, \end{aligned}$$
(48)
$$\begin{aligned} & \textbf{X}=\sum _{m=1}^{r'} \mathbf {\Psi _{m}} \textbf{L}_m, \end{aligned}$$
(49)

where m is the mode number. Thus, DMD mode coefficients show the time coefficients of dominant structures.

The time-resolved DMD mode coefficient matrix \(\textbf{L}\in \mathbb {R}^{r'\times NF}\) can be calculated by projecting time-resolved POD mode coefficients \(\textbf{Z}_{\text {kalman}} \in \mathbb {R}^{r\times NF}\) obtained by the Kalman filter and the RTS smoother into DMD mode space as follows:

$$\begin{aligned} \textbf{L}={\mathbf {\Phi }}^{\dagger } \textbf{Z}_{\text {kalman}}, \end{aligned}$$
(50)

3 Experimental apparatus

3.1 Jet generating system

The present experiments were performed using a device that generates a supersonic jet installed in an anechoic room provided by Tohoku University. Figure 5 shows a schematic image of this device: high-pressure air is stored inside a high-pressure tank by using a compression pump, being all these systems located upstream. Also, a plenum chamber is located at the center of the anechoic room, being the stagnation point of the jet flow. The jet is ejected vertically to the ceiling and the flow speed is changed by adjusting the ratio between the pressure in the plenum chamber and the atmospheric pressure in the anechoic room. Refer to Ozawa et al. (2020a, 2020b) for more details about the experimental facilities. Figure 6 shows the cross-sectional shape of the convergent nozzle, which was used for reproducing an underexpanded supersonic jet. The nozzle exit diameter is 10 mm and the contour of the nozzle was designed based on the reference (André et al. 2013). The nozzle pressure ratio was 2.14 corresponding to the Mach 1.35, and the Reynolds number based on the nozzle exit diameter was \(3.37 \times 10^5\). The stagnation temperature was 297 K.

Fig. 5
figure 5

Schematic image of the supersonic jet generating device

Fig. 6
figure 6

Cross-sectional shape of the convergent nozzle

3.2 Measurement system

Figure 7 shows the experimental setup which is composed of a dual PIV system (two pairs of cameras and lasers) and near-field acoustic measurements (18 microphones). Table 1 summarizes the spatial and temporal resolution of the measurement. Here, the ratio between the acoustic measurements and PIV measurements sampling rate was set to be \(F=40\).

Table 1 Parameters of the measurements

3.2.1 Synchronization of a dual PIV measurement and acoustic measurements

Three function generators (WF1974, NF) and a delay generator (DG535, SRS) were used for synchronization of a dual PIV measurement and acoustic measurements. Figure 8 shows the generation and flow of trigger and synchronous signals in the present study. The blue and red lines show the trigger signal path and synchronous signal path, respectively. Here, the function generator 3 and the delay generator play the same role as each other, and they generate signals with delays after they receive trigger signals from the channels 1 and 2. Figure 9 shows the timing chart of exposure times of camera frames and laser pulses. The delays A and C were adjusted so that the first and second laser pulses are in the exposure time of first and second frames, respectively. The delays B and D were set to the time interval for each PIV system, 1 µs. Each microphone is connected to a data acquisition (DAQ) system through an amplifier, and the acoustic measurements are synchronized with the PIV measurements by the synchronous signal.

3.2.2 Dual PIV system

The dual PIV system was built for acquisition of the short time interval paired velocity fields. Two double-pulsed lasers (LDY-300PIV, Litron, and DM60-527, Photonics industries) with wavelengths of 532 and 527 nm, respectively, and two high-speed cameras (Phantom V1840 and V2640, Vision Research) were used and placed in the anechoic room. The cameras, situated symmetrically around the nozzle, are embedded with a band pass filter. The wavelengths of the band pass filter are 532±1.5 and 527±5 nm, respectively, and they correspond to the wavelength of the lasers. In this way, each laser only interferes with its assigned camera. Both laser sheets are oriented vertically and situated over the center of the nozzle exit.

Here, the time chart of the velocity fields taken by each camera shown in Fig. 9 is explained. The time intervals for each PIV system and image pairs were set to be 1 µs and 2.5 µs, respectively. The signal colored by light red is repeated every 400 µs. In other words, the function generator 2 generates the colored signal at 2500 Hz. However, two velocity fields are obtained by the colored signal in each PIV system, and the temporal resolution of PIV is 5000 Hz, as shown in Table 1. Here, the colored signal was created by using arbitrary waveform creation software of the function generator.

3.2.3 Acoustic measurement system

Near-field acoustic measurements are simultaneously performed in the experiment using 18 microphones (TYPE4158N, ACO), as shown in Fig. 7. Six microphones arranged in a hexagon were installed in three streamwise positions. The streamwise positions \(x_{m}\) of three layers were set to \(x_{m}/D\) = 0, 3 and 12, respectively, and the radial distances \(r_m\) between the nozzle center axis and the microphones on three layers were set to \(r_m/D\) = 2, 8 and 8, respectively. These positions prevent the microphones from interfering the camera view. The signals from the microphones are intensified by the amplifiers (TYPE5006/4, ACO) and recorded using data acquisition system (USB-6366, National Instruments).

Fig. 7
figure 7

Experiment setup: a Dual PIV measurement. b Microphone placements

Fig. 8
figure 8

Synchronization of a dual PIV measurement and acoustic measurements by function generators and a delay generator

Fig. 9
figure 9

Timing chart of dual PIV. \(\varvec{\alpha }\), depicted in yellow, and \(\varvec{\beta }\), depicted in blue

4 Results and discussion

4.1 Acoustic results

Figure 10 shows the power spectral density (PSD) of the sound pressure level (SPL) of 18 microphones. Black, blue, and red lines are the PSD measured at the microphone positions \((x_{m}/D,r_{m}/D)=(0,2), (3,8), (12,8)\), respectively. Here, the Hann window was used for PSD calculation, and the window size is 1024. Also, the results of the Fourier analysis were averaged 1000 times, whereas the overlap ratio of windows is 50 %. The frequency is nondimensionalized as the Strouhal number \(St = fD/U_j\) where \(U_j=397\) m/s and \(D=\)10 mm are the jet velocity and diameter at the nozzle exit, respectively. The distinct peak of screech is observed at 12.9 kHz (\(St=0.66\)) and the second and third harmonics are observed.

Fig. 10
figure 10

Acoustic spectrum at the nozzle exit \((x{_m}/D,r{_m}/D)=(0,2), (3,8), (12,8)\)

4.2 PIV results

The PIV velocity fields are estimated by using the recursive cross-correlation method, reducing the correlation window size from \(32 \times 32\) to \(8 \times 8\). As a result, the final spatial resolution of the estimated velocity fields was \(104 \times 257\) for the particle images of which spatial resolution is \(800 \times 2048\). Figure 11 shows the basic characteristics of the velocity field of a Mach 1.35 supersonic jet. The velocity field is nondimensionalized using the jet velocity at the nozzle exit \(U_j=397\) m/s derived from the equation of the isentropic flow. The mean velocity field shows the potential core and the shear layer development of the jet. The velocity fluctuation in the potential core is due to the occurrence of the shock cells. The standard deviation and the Reynolds stress are distributed mainly in the shear layer region, as expected. Since the observed characteristics are consistent with the previous findings, the dual PIV system works well for the present study.

Fig. 11
figure 11

Basic characteristics of the velocity field of a Mach 1.35 supersonic jet

Fig. 12
figure 12

Summary of the obtained DMD modes: a Growth rate of Eigenvalues. b Amplitude spectrum and spatial distributions of the DMD modes

The tlsDMD analysis (Dawson et al. 2016) was applied to the paired velocity field taken with the short time interval of 2.5 µs. Figure 12 summarizes the results of DMD. Figure 12a shows that the growth rates of most DMD modes are near 1, which indicates that these modes are purely oscillatory modes. Figure 12b shows the amplitude spectrum of microphones and spatial distributions in the axial and vertical direction of the DMD modes projected to the velocity field. Here, one of each pair of DMD modes of which eigenvalues are in the complex conjugate relationship is shown. This figure illustrates that the flapping mode 1 was obtained near the screech frequency of the microphone measurements and that this mode is estimated to be the structure relating to the screech tone. This result is consistent with the previous findings, which say that the screech tone associated with the flapping mode is dominant at Mach 1.35 (Tam 1995). Then, the modes 3, 5, 7 and 10 which are estimated to be components of the large-scale turbulent structure can be confirmed in the range of 1.0 to 5.0 kHz, and the mode 12 which is estimated to be the streak structure can be confirmed at the frequency near 0 Hz.

4.3 Reconstructed PIV result in spatiotemporal superresolution

Figures 13 and 14 depict the snapshots of the superresolved axial velocity field and velocity fluctuation field. They depict snapshots in the different time zones. Figures 13 and 14 illustrate snapshots from \(t=0\) µs and \(t=8000\) µs, respectively. The smooth convection of the large-scale structure on the downstream side and the structure relating to the screech tone on the upstream side can be observed, while the POD-based spatiotemporal superresolution measurement (Ozawa et al. 2021) cannot estimate such large-scale structures. Therefore, the proposed method is effective to reconstruct the entire flow fluctuation because the DMD modes express the linear dynamical system of the velocity fields.

Also, the strength of the structure relating to the screech tone in snapshots from t=8000 µs is observed to be stronger than the one in snapshots from t=0 µs by comparing velocity fluctuation fields in different time zones. This is because the structure relating to the screech tone rotates around the streamwise axis. The strength of the observed structure looks strong when it is in the PIV measurement plane, and the strength looks weak when it is out of the PIV measurement plane. The rotation of the structure relating to the screech tone is mentioned in the previous work (Lee et al. 2024). Next, most of the downstream region in snapshots from \(t=8000\) µs is observed to be accelerated, while most of the downstream region in snapshots from t=0 µs is shown to be decelerated by comparing velocity fluctuation fields in different time zones. In addition, the width of the mixing layer in snapshots from \(t=8000\) µs seems to be wider than the one in snapshots from \(t=0\) µs by comparing velocity fields in different time zones.

Fig. 13
figure 13

Snapshots of the spatiotemporal superresolved axial velocity components from t=0 µs

Fig. 14
figure 14

Snapshots of the spatiotemporal superresolved axial velocity components from t=8000 µs

The accuracy of the proposed method was evaluated by downsampling the velocity fields that were originally acquired in the PIV measurement. The downsampled data was used for calculating system and observation matrices, while data other than the downsampled data was used as the reference data for calculating the reconstruction errors. The downsampling rates \(f_{\text {dwn}}\) were set to 1/2, 1/4, 1/8, and 1/16 of full data, and the ratios of the sampling rates of the microphone and PIV measurements F are 80, 160, 320, and 640, respectively. Then, three kinds of reconstruction errors \(E_1\), \(E_2\), and \(E_3\) were calculated by using the following equation:

$$\begin{aligned} & E_1 = \frac{\Vert {\hat{\textbf{Z}}}-{\textbf{Z}}\Vert ^2_F}{ \Vert {\textbf{Z}} \Vert ^2_F } \times 100, \end{aligned}$$
(51)
$$\begin{aligned} & E_2 = \frac{\Vert {\hat{\textbf{X}}}-{\textbf{X}}\Vert ^2_F}{ \Vert {\textbf{X}} \Vert ^2_F } \times 100, \end{aligned}$$
(52)
$$\begin{aligned} & E_3 = \frac{\Vert {\hat{\textbf{X}}_\text {mean+}}-{\textbf{X}_\text {mean+}}\Vert ^2_F}{ \Vert {\textbf{X}_\text {mean+}} \Vert ^2_F } \times 100, \end{aligned}$$
(53)

where \(\textbf{Z}{\in \mathbb {R}^{r \times N_{\text {dwn}}}}\), \(\textbf{X}\in \mathbb {R}^{2n \times N_{\text {dwn}}}\), and, \(\textbf{X}_\text {mean+}\in \mathbb {R}^{2n \times N_{\text {dwn}}}\) are data other than downsampled data which were originally acquired by PIV measurement, respectively. Also, \({\hat{\textbf{Z}}}{\in \mathbb {R}^{r \times N_{\text {dwn}}}}\), \({\hat{\textbf{X}}}\in \mathbb {R}^{2n \times N_{\text {dwn}}}\), and \({\hat{\textbf{X}}_\text {mean+}}\in \mathbb {R}^{2n \times N_{\text {dwn}}}\) are the estimated POD mode coefficients, velocity fluctuation, and velocity fields (mean + fluctuation), respectively. Here, \(N_{\text {dwn}}\) is the number of data other than downsampled data, corresponding to \(N-N f_{\text {dwn}}\).

The originally acquired data are also lowered in dimension in Eq. (51), indicating the estimation accuracy of the low-dimensional data itself. On the other hand, \(E_2\) and \(E_3\) include not only the reconstruction error of the low-dimensional data itself but also the errors due to the reduction in dimension by the SVD.

In this study, the numbers of both POD and DMD modes were set to 14, and \(\lambda\) = 1 by considering reconstruction errors \(E_1\) and \(E_2\). A detail is provided in Appendix C. Figure 15 shows the reconstruction errors when the downsampling rates were 1/2, 1/4, 1/8, and 1/16. As a result, the reconstruction errors \(E_1\), \(E_2\), and \(E_3\) when the downsampling rate is 1/2 are the minimum (78.92 %, 92.77 %, and 22.81 %, respectively). The errors can be further decreased when the downsampling rate is 1/1 of the full data. The linear estimations of errors \(E_1\), \(E_2\), and \(E_3\) at the downsampling rate of 1/1 are 67.72 %, 89.15 %, and 21.89 %, respectively. These reconstruction errors are significantly lower than the reconstruction error \(E_1\) (96.69 %) when the spatiotemporal superresolution measurement without a dynamical system was conducted only by the linear least-square regression of velocity field and acoustic data with (Nishikori 2022). Ozawa et al. (2022), Lee et al. (2023) and Lee et al. (2024) improved the estimation accuracy of the linear regression model by selecting POD modes of a time-delay embedded microphone data matrix of which correlation with the PIV or BOS data is high. However, the fluid structure which could be reconstructed was limited to the structure relating to the screech tone which has a high correlation with the acoustic data. On the other hand, the proposed method successfully reconstructed structures that do not have high correlations with the acoustic data like the large-scale turbulent structure downstream.

Fig. 15
figure 15

Reconstruction errors \(E_1\), \(E_2\), and \(E_3\) of all the estimated velocity fields at the time steps when the downsampling rates were 1/2, 1/4, 1/8, and 1/16 (the numbers of both POD and DMD modes are 14, and \(\lambda\) = 1)

4.4 Correlation between DMD mode coefficients

The correlations between DMD mode coefficients are calculated to clarify the influence of the mixing promotion by screech tones on the large-scale turbulent structure and the streak structure downstream.

The DMD mode coefficient is composed of complex numbers, and it oscillates at each DMD mode frequency with the amplitude fluctuating. When we look at the real or imaginary part of the DMD mode coefficient, it also oscillates at each DMD mode frequency with the amplitude fluctuating as well. On the other hand, information about only amplitude fluctuation can be extracted when we look at the absolute values of DMD mode coefficients. Figure 16 shows the real part and absolute values of the complex DMD mode coefficient (the mode 1). The real part oscillates at the screech first peak (12.4 kHz) and the absolute values also oscillates at a lower frequency than the real part. This oscillation of the absolute values is due to the rotation around the streamwise axis of the structure relating to the screech tone, as mentioned earlier. The absolute values are high when the structure relating to the screech tone is in the PIV measurement plane. On the other hand, the absolute values are low, when it is out of the PIV measurement plane.

Fig. 16
figure 16

Real part and the absolute values of the DMD mode coefficient (the mode 1)

Table 2 shows the correlation coefficients between the absolute values of the DMD mode coefficient of the mode 1 and the ones of other modes. This result illustrates that correlation coefficients with the modes 3, 5, 7 and 10 are low. These modes are dominant structures which construct the large-scale turbulent structure. Thus, it shows that the amplitudes of large-scale turbulent structures are not influenced by change in the amplitude of the structure relating to screech tone. On the other hand, the correlation coefficient with the mode 12 is relatively high. This mode is the streak structure which represents acceleration and deceleration of the flow field downstream, and it shows that the amplitude of the acceleration and deceleration is influenced by change in the strength of the structure relating to screech tone to some degree. This result implicates that there may be some relationship between the structure relating to the screech tone and the streak structure.

Table 2 Correlation coefficients with other absolute values of the DMD mode coefficients

Hereafter, we focus on the relationship between the structure relating to the screech tone (the mode 1) and the streak structure (the mode 12). Figures 17 and 18 show the covariance between the absolute DMD values of the mode coefficient of the mode 1 and the time history of the velocity field component composed of only streak structure at each spatial point. The time history is obtained from the real part of the velocity field data matrix composed of only the mode 12. Based on Eq. (49), this data matrix is calculated as follows:

$$\begin{aligned} \textbf{X}_{\text {PIV}}^{m=12}=\mathbf {\Psi }_{12} \textbf{L}_{12}, \end{aligned}$$
(54)

where \(\textbf{X}_{\text {PIV}}^{m=12}\) is the velocity field data matrix composed of only the mode 12. The actual velocity field relating to only the streak structure is composed of the modes 12 and 13. Here, the mode 13 is the DMD mode of which eigenvalue is in a complex conjugate relationship with the eigenvalue of the mode 12. However, the following relationship consists:

$$\begin{aligned} 2 \textrm{Re} (\mathbf {\Psi }_{12} \textbf{L}_{12})= \mathbf {\Psi }_{12} \textbf{L}_{12} + \mathbf {\Psi }_{13} \textbf{L}_{13}, \end{aligned}$$
(55)

where \(\textrm{Re}( \cdot )\) denotes the real part of the complex matrix. Therefore, the real part of \(\textbf{X}_{\text {PIV}}^{m=12}\) is used and the time history of the velocity field component composed of only streak structure is obtained.

Figure 17a shows the covariance with the time history of the axial component of the velocity field at each spatial point. This result represents that the positive streak structure is generated when the amplitude of the structure relating to the screech tone is large in the PIV measurement plane. In other words, the downstream flow is accelerated in the mainstream direction when the structure relating to screech is in-plane, and vice versa (Fig. 17b). Then, Fig. 18a shows the covariance with the vertical component of the velocity field at each spatial point. This result represents that the downstream flow is accelerated in the radial direction when the structure relating to screech is in-plane, and vice versa (Fig. 18b). Here, the acceleration and deceleration in the radial direction represents the movement of the streamwise vortice, and the acceleration in the radial direction corresponds to the mixing promotion. This is consistent with findings of previous researches that screech tones promote the mixing in downstream due to the streamwise vortices (Alkislar et al. 2003; Husain and Hussain 1993). In addition, the relationship of acceleration/deceleration between the mainstream and the radial directions represents the lift-up mechanism that lifting of high-/low-velocity fluid from the center/outer jet induces positive/negative streaks, respectively (Pickering et al. 2020). Therefore, the in-plane structure relating to screech tone causes the lift-up of the high-velocity fluid from the jet center (mixing promotion), and the lift-up accelerates the downstream by lift-up mechanism.

Fig. 17
figure 17

Relationship between the structure relating to screech tone and the streak structure (axial component): a covariance with the time history of the axial component of the velocity field at each spatial point. b Acceleration/deceleration in the mainstream direction of downstream depending on the screech position

Fig. 18
figure 18

Relationship between the structure relating to screech tone and the streak structure (vertical component): a covariance with the vertical component of the velocity field at each spatial point. b Acceleration/deceleration in the radial direction of downstream depending on the screech position

Next, the relationship between the amplitude of the structure relating to the screech tone and the acceleration/deceleration of the streak structure is evaluated from the other point of view. Figure 19a shows the absolute values of the DMD mode coefficient of the mode 1 which represents the time history of the amplitude of the structure relating to the screech tone. Figure 19b shows the real part of the DMD mode coefficient of the mode 12 which represents the time history of acceleration/deceleration of the streak structure. Then, the correlation between the two time histories is calculated. Here, the phase of the real part of the mode 12 coefficient should be adjusted because the phase varies depending on the axis into which the complex value is projected, including the real axis and the imaginary axis. When the phase of the real part of the mode 12 coefficient is changed by \(\theta\), the real part of the mode 12 coefficient is rewritten as follows:

$$\begin{aligned} \textrm{Re}(\textbf{Z}_\mathrm{{DMD}}^{12} \cdot e^{i \theta } )=\textrm{Re}(\textbf{Z}_\mathrm{{DMD}}^{12}) \cos \theta -\textrm{Im}(\textbf{Z}_\mathrm{{DMD}}^{12}) \sin \theta , \end{aligned}$$
(56)

where \(\textrm{Im}(\cdot )\) denotes the imaginary part of the complex matrix.

Here, the correlation coefficient between the absolute DMD mode coefficient of the mode 1, \(|\textbf{Z}_\mathrm{{DMD}}^{1}|\) and the real part of the DMD mode coefficient of the mode 12 with the phase changed, \(\textrm{Re}(\textbf{Z}_\mathrm{{DMD}}^{12} \cdot e^{i \theta } )\) is the function of \(\theta\). Thus, \({\theta }_{\max }\) when the correlation coefficient is maximized is solved by calculating the derivative of the correlation coefficient. As a result, the correlation coefficient is maximized when \(\theta\) is 0.1097 (Fig. 20) and the maximized correlation coefficient is 0.797. This high correlation coefficient shows that the period of change in amplitude of the structure relating to the screech tone corresponds to the period of acceleration/deceleration of the streak structure.

Fig. 19
figure 19

Correlation between the amplitude of the structure relating to the screech tone and acceleration/deceleration of the streak structure: a Time history of the amplitude of the structure relating to the screech tone. b Time history of acceleration/deceleration of the streak structure

Fig. 20
figure 20

Correlation coefficient when the phase of the real part of the mode 12 coefficient is changed by \(\theta\)

5 Conclusions

The present study applies a framework of the spatiotemporal superresolution measurement based on tlsDMD, the Kalman filter and the Rauch-Tung-Striebel smoother to a Mach 1.35 axisymmetric underexpanded supersonic jet. The dual PIV system was constructed and paired velocity fields of the flow with a short time interval at the sampling rate of 5000 Hz were obtained. Also, high-frequency acoustic measurements of 200,000 Hz were simultaneously conducted. Then, the temporal development information was obtained by applying tlsDMD to the POD mode coefficients of the paired velocity fields. Finally, the superresolved velocity fields of the supersonic jet were reconstructed at the same sampling rate as that of the acoustic measurements (200,000 Hz) by interpolating the temporal development information and acoustic data into the Kalman filter and the RTS smoother. The superresolved result illustrates the smooth convection of the large-scale turbulent structure on the downstream side as well as that of the structure relating to the screech tone on the upstream side, while the previous POD-based spatiotemporal superresolution measurement cannot estimate such large-scale structures. The reason is that the proposed method can reconstruct the structures which do not have high correlation with the acoustic data. In addition, from the tlsDMD results, the DMD mode 1 which is estimated to be the structure relating to the screech tone was obtained near the screech frequency of the microphone measurements. In addition, the DMD modes 3, 5, 7, and 10 are estimated to be components of the large-scale turbulent structure, and the DMD mode 12 is estimated to be the streak structure.

Next, the time coefficients of the DMD modes were calculated from the POD mode coefficients which are time-resolved by the Kalman filter and the RTS smoother. Then, the absolute value of the DMD mode coefficient of the mode 1 which shows the time history of the amplitude of the structure relating to the screech tone was calculated. The covariance distribution of it with the time history of the velocity field which is composed of only the mode 12 shows that the acceleration of the downstream in the axial and radial direction occurs when the structure relating to the screech tone is in the PIV measurement plane, and vice versa. It shows that the screech causes the mixing promotion of the downstream, and this mixing promotion accelerates the downstream in the axial direction by lift-up mechanism. Also, the real part of the DMD mode coefficient of the mode 12 which shows the time history of acceleration/deceleration of the streak structure, was calculated. The correlation coefficient between the absolute values of the DMD mode coefficient of the mode 1 and the real part of the DMD mode coefficient of the mode 12 is 0.797 when the phase of the real part is changed by 0.1097. It shows that the period of change in amplitude of the structure relating to the screech tone corresponds to the period of acceleration/deceleration of the streak structure.

In this study, nonlinear fluid dynamics is estimated by using a linear dynamical model. This is based on the assumption nonlinear dynamics in short time intervals can be approximated by the linear model. However, the estimation accuracy became worse when the number of the POD modes used for the reconstruction was larger as shown in Appendix C. It shows that the estimation of higher-frequency turbulent structures is difficult in the proposed method. This is a reason why the fine-scale turbulent structure relating to the broadband shock-associated noise could not be reconstructed although the large-scale turbulent structure and the structure relating to the screech tone were successfully reconstructed. Therefore, the fluid structures that can be reconstructed are limited to the structures with lower frequency compared to the short time interval of the paired data. This is the limitation of the proposed method. Therefore, in the future, we will focus on the development of the spatiotemporal superresolution measurement with the nonlinear dynamical model which can estimate the nonlinear fluid dynamics.