1 Introduction

Optical measuring methods are valued for their noncontact and noninterference characteristics. Optical velocimetry comprises interference velocimetry and noninterference velocimetry. Many researchers have focused on interference velocimetry, such as laser Doppler velocimetry (LDV) [1], laser speckle velocimetry (LSV) [2] and photon Doppler velocimetry (PDV) [3]. LDV is already widely applied in various fields of science and engineering due to its high spatial resolution and high accuracy [4]. However, interference velocimetry requires a laser, which is inconvenient in practice, whereas noninterference techniques require only a white light source. The optical spatial filtering velocimetry (SFV) method was initially proposed by Ator in 1963 and was used to measure displacement [5]. Jakobsen and colleagues proposed an optical SFV sensor that uses a laser diode (LD) to measure in-plane vibrations at the submicron level [6]. Non-optical SFV is also used to measure the velocity of solid particles [7].

Optical SFV estimates the velocity of a moving object from the scattered light that a spatial filter modulates as the object moves; the velocity is proportional to the frequency of the acquired signal. SFV is effective even without tracing the exact position of the object; thus, SFV is promising for applications involving complex conditions such as flow velocity measurement [8] and vehicle velocity measurement [9]. However, SFV did not initially attract much attention, despite exhibiting performance similar to that of LDV. Gradually, SFV has been appreciated for the simplicity of its optical and mechanical structure and has already been used in certain science and engineering fields [10, 11].

In industry, by monitoring the velocity of a high-voltage circuit breaker, the breakdown of a power system can be prevented. However, currently, tests on circuit breakers are mostly conducted off-line. Furthermore, inappropriate mounting and dismounting of a circuit breaker during testing account for 10% of all reported malfunctions [12]. Therefore, it is necessary to propose a noncontact real-time velocity measurement system.

The acceleration of the circuit breaker measured in this study reaches up to 20 m/s², and the velocity increases from zero to approximately 2.8 m/s, which introduces certain challenges, including a mismatch between the low-frequency and high-frequency change rates. The time resolution of the measurement should be 100 µs or finer to recognize undesired circuit breaker behavior. The requirement of continuous measurement also places a strict requirement on the behavior of the frequency spectra, because corruption of the signal spectrum causes the velocity measurements to be incorrect. According to the literature, few studies on optical SFV provide a continuous velocity curve with respect to time.

In this manuscript, we mainly discuss the continuous measurement of the one-dimensional velocity of a high-voltage circuit breaker and its accuracy. This paper is organized as follows. The basic principle of the spatial filtering method and the principle of selecting a spatial filter are discussed in Sect. 2. Section 3 describes the principles of signal processing, in which a CMOS sensor is used as an adaptive spatial filter and photodetector. The experiment to measure the velocity of the circuit breaker is described in Sect. 4, and analyses based on a spatial filter with a rectangular transmission function (RTSF) and on a spatial filter with a sinusoidal transmission function (STSF) are discussed. Finally, conclusions are drawn in Sect. 5.

2 Principle of spatial filtering velocimetry

Figure 1 shows a basic optical SFV system. The moving object moves along the x-axis with incident illumination oblique to the object surface. Then, light in the image plane is modulated by the spatial filter, which is usually a grating. The modulated light is detected by the photodetector; thus, a periodic electrical signal is acquired.

Fig. 1 Basic optical velocimetry system

The signal \(g({x_r},{y_r})\) acquired by the photodetector (PD) is described by the light intensity distribution \(f(x,y)\) and the transmission function \(h(x,y)\) of the spatial filter (SF):

$$g({x_r},{y_r})=\iint {f({x_r}-x,{y_r}-y)h(x,y){\text{d}}x{\text{d}}y,}$$
(1)

where \({x_r}={v_x}t+{c_1}\); \({y_r}={v_y}t+{c_2}\); both \({c_1}\) and \({c_2}\) are constant; \({v_x}\) and \({v_y}\) are the velocity of the image of the object along the x-axis and that along the y-axis, respectively; and \(t\) is the movement time. Given spatial frequencies \(\mu\) and \(\nu\), the spatial power spectral density function \({G_P}(\mu ,\nu )\) of \(g({x_r},{y_r})\) is derived as

$${G_P}(\mu ,\nu )={F_P}(\mu ,\nu ){H_P}(\mu ,\nu ).$$
(2)

Here, \({F_P}(\mu ,\nu )\) is the spatial power spectral density function of the light intensity distribution \(f(x,y)\), and \({H_P}(\mu ,\nu )\) is the spatial power spectral density function of \(h(x,y)\).

If the moving image is small enough relative to the period of the SF, \({F_P}(\mu ,\nu )\) can be considered constant, and given that \(h(x,y)\) is constant along the y-axis, \({H_P}(\mu ,\nu )\) can be written as

$${H_P}\left( {\mu ,\nu } \right)={H_P}\left( {\mu ,0} \right)\delta (\nu ),$$
(3)

where \({H_P}(\mu ,\nu )\) is nearly impulsive along the \(\nu\)-axis because the transmission function is uniform along the y-axis. If we assume that the periodic transmission function of the SF is rectangular, the SF can be regarded as a narrow band filter whose fundamental frequency and harmonics are located at \(\mu =\frac{k}{p}{\text{ }}(k=1,3,5, \ldots )\), as shown in Fig. 2. Since the other minor peaks are due to the aperture of the SF and are negligible, the narrow bandpass SF passes only the offset component located at \((\mu =0,{\text{ }}\nu =0)\), the fundamental component \(({\mu _0}=\frac{1}{p},{\text{ }}\nu =0)\) and its harmonics.
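As a brief supporting derivation for the odd-harmonic structure stated above, assume an idealized rectangular transmission function of period \(p\) that alternates between 1 and 0 with a 50% duty cycle and an unlimited aperture (these idealizations are assumptions of this sketch, not conditions stated for the experiment). Its Fourier series along the x-axis is

$$h(x)=\frac{1}{2}+\frac{2}{\pi }\sum\limits_{k=1,3,5, \ldots } {\frac{1}{k}\sin \left( {\frac{{2\pi kx}}{p}} \right)} ,$$

so the power of the k-th harmonic falls off as \(1/{k^2}\); the third harmonic, for example, carries 1/9 of the power of the fundamental. Under the same assumptions, a sinusoidal transmission function \(h(x)=\frac{1}{2}+\frac{1}{2}\sin \left( {\frac{{2\pi x}}{p}} \right)\) contains only the offset and the fundamental.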

Fig. 2 Distribution of the energy spectrum function \({F_P}(\mu ,0)\) and \({H_P}(\mu ,0)\) of the SF in Fig. 1

Given [13],

$$\frac{f}{{{V_{{\text{image}}}}}}=\mu.$$
(4)

Here, f is the frequency of the acquired periodic signal and \({V_{{\text{image}}}}\) is the velocity of the moving image. Equation (4) implies that the signal frequency is proportional to the velocity of the image and that multiple spatial frequency components of the SF will cause multiple signal frequency peaks.

Thus, Eq. (2) can be written as

$${G_P}\left( f \right)=\frac{1}{{{V_{\text{image}}}}}\int_{ - \infty }^{\infty } {{F_P}\left( {\frac{f}{{{V_{\text{image}}}}},\nu } \right){H_P}\left( {\frac{f}{{{V_{\text{image}}}}},\nu } \right){\text{d}}\nu } .$$
(5)

To make use of Eq. (5), three preconditions are needed, and they are all satisfied when the image is small enough and the SF is an RTSF:

  1. \({F_P}\left( {{\mu _0},0} \right)\) is larger than \({F_P}\left( {k{\mu _0},0} \right)\) \((k=3,5,7, \ldots ).\)

  2. \({H_P}\left( {\frac{f}{{{V_{\text{image}}}}},\nu } \right)\) is unimodal along the entire y-axis.

  3. \({H_P}\left( {{\mu _0},0} \right)\) is larger than \({H_P}\left( {k{\mu _0},0} \right)\) \((k=3,5,7, \ldots ).\)

Here, \({\mu _0}\) is the fundamental frequency component. When these preconditions are met, \({G_P}\left( f \right)\) contains an offset component and periodic signal components that have a frequency of \(f=k{\mu _0}{V_{\text{image}}}\,(k=1,3,5,7 \ldots ).\)

After eliminating the offset component, the fundamental frequency that corresponds to the spatial frequency \(\mu ={\mu _0}\) is strongest and contains information on the velocity. Therefore, the velocity of the moving object, \({v_0}\), can be calculated as [13]

$${v_0}=\frac{p}{M}{f_0},$$
(6)

where \(p\) is the period of the SF along the x-axis, \(M\) is the optical magnification and \({f_0}\) is the frequency of the fundamental component.
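To make the scale of Eq. (6) concrete, the following minimal sketch converts between signal frequency and object velocity. The numerical values (14 µm pixels, a filter period of 8 pixels, M = 0.021 and a peak velocity of 2.8 m/s) are the ones quoted for the experiment in Sect. 4; the function and variable names are illustrative only.

```python
def frequency_to_velocity(f0_hz, period_m, magnification):
    """Eq. (6): object velocity v0 = (p / M) * f0."""
    return period_m / magnification * f0_hz


def velocity_to_frequency(v0_m_s, period_m, magnification):
    """Inverse of Eq. (6): signal frequency produced by an object velocity."""
    return v0_m_s * magnification / period_m


# Values quoted in Sect. 4 of the text.
PIXEL_SIZE_M = 14e-6      # pixel size a
K_PIXELS = 8              # filter period in pixels
MAGNIFICATION = 0.021     # optical magnification M

period_m = K_PIXELS * PIXEL_SIZE_M          # filter period p = k * a
f_at_vmax = velocity_to_frequency(2.8, period_m, MAGNIFICATION)
print(f"signal frequency at 2.8 m/s: {f_at_vmax:.1f} Hz")
```

Under these values the fundamental at the peak velocity lies near 525 Hz, well below half of the 30 kHz row transfer frequency.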

In practice, the first precondition is seldom satisfied, as shown in Fig. 2, especially when the image consists of patterns. Figure 3 shows the spatial frequency spectrum of a moving image of the measured object in blue and the spatial frequency spectrum of an ideal RTSF and STSF in orange. The multiple frequency peaks in the spectrum of the image are derived from the bands in the acquired image, which are shown as black and white rectangles in Fig. 6a.

Fig. 3 Frequency spectrum of the moving object and of a rectangular (a) and sinusoidal (b) transmission function. The first order of frequency peaks of an RTSF may correspond to a low spatial frequency intensity of a moving object

Figure 3a implies that the fundamental may not be the strongest component: although \({H_P}({\mu _0},{\text{0}})\) is larger than \({H_P}(k{\mu _0},0)\), \({F_P}({\mu _0},0)\) might be much smaller than \({F_P}(k{\mu _0},0)\) for k = 3, 5, 7, …; a higher harmonic would then be misjudged as the fundamental, causing an error.

It appears that if the pattern of the object is definite, this problem can be solved by choosing an optimal \({\mu _0}\). However, additional noise caused by ambient light or CMOS as well as distortion will also lead to unexpected behavior of the power spectrum, which suggests that the optimal \({\mu _0}\) for a certain time may be nonoptimal for another.

To solve this problem, an STSF is needed. The power spectrum of an STSF is strictly unimodal along the positive x-axis and along the negative x-axis, as shown in Fig. 3b, and, thus, multiple signal frequency components will not be produced in the spectrum of SF.

3 Principle of signal analysis based on linear CMOS

Traditionally, transmission gratings for SFV are hardware components and, therefore, they are inflexible. However, images acquired electronically by camera can be applied to the same process in a computer using masks. Such software masks can be generated with various shapes and frequencies.

In Fig. 4, two masks (RTSF) are illustrated on the left. The masks weight the individual pixel values in the image with a value of 1 for a white mask position and with a value of 0 for a black mask position. All the weighted pixels within an image are summed for each mask. The resulting signal will have a sinusoidal characteristic as a function of elapsed time. Subtraction of the two sums provides a differential signal, defined as g(t) on the right of Fig. 4, in which any offset is ideally suppressed [14]. Therefore, the offset component is eliminated to a first approximation. This situation is similar to using a grating, in which one photodetector acquires the transmitted light, while a second photodetector acquires the light reflected from the grating. Technically, it is easy to combine the two masks into one mask weighted with values of 1 and −1, as discussed later.

Fig. 4 Analog differential processing

The period p of an SF is expressed as

$$p=ka,$$
(7)

where a is the pixel size and k is the period of the SF in pixels.
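The software masks described above can be sketched as follows. This is an illustrative construction, not the authors' code: the function names are hypothetical, and sampling the sinusoid at pixel centres (the `+ 0.5` offset) is an assumption of this sketch.

```python
import numpy as np


def rectangular_mask(n_pixels, k):
    """Combined +1/-1 rectangular mask (RTSF) with a period of k pixels."""
    phase = np.arange(n_pixels) % k
    return np.where(phase < k / 2, 1.0, -1.0)


def sinusoidal_mask(n_pixels, k):
    """Sinusoidal mask (STSF) with a period of k pixels.

    Assumption: the sinusoid is sampled at pixel centres.
    """
    x = np.arange(n_pixels) + 0.5
    return np.sin(2 * np.pi * x / k)


def harmonic_ratio(mask, k, order=3):
    """Magnitude of a harmonic relative to the fundamental.

    Requires len(mask) to be a multiple of k so both fall on exact FFT bins.
    """
    spectrum = np.abs(np.fft.rfft(mask))
    fundamental_bin = len(mask) // k
    return spectrum[order * fundamental_bin] / spectrum[fundamental_bin]


rect = rectangular_mask(1024, 8)
sine = sinusoidal_mask(1024, 8)
print("third harmonic / fundamental, rectangular:", harmonic_ratio(rect, 8))
print("third harmonic / fundamental, sinusoidal: ", harmonic_ratio(sine, 8))
```

With k = 8 the rectangular mask keeps a substantial third harmonic, whereas the sinusoidal mask leaves only numerical residue at that bin. With the pixel-centre sampling assumed here, a sinusoidal mask with k = 4 is simply a scaled copy of the rectangular one, so the two filters cannot be told apart at that period.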

To minimize both systematic and random errors, the number of grating periods must be maximized [12]. This condition is rather easy to achieve because the number of pixels in a linear CMOS can be rather large. Moreover, a smaller pixel pitch enables flexible adjustment of the period of the simulated SF. Finally, according to Nyquist’s theorem, the line scan rate should be greater than twice the maximum signal frequency, which can be derived from Eq. (6).

Further signal processing involves two parts: spectrum analysis and spectrum rectification. Many algorithms, such as the phase unwrapping methods [15], can be used to estimate the frequency of the signal, but the FFT algorithm is a better way to analyze the spectrum in great detail. The FFT method can extract the desired signal from strong noise. However, if we take the frequency of the peak in the frequency spectrum as the frequency of the detected signal, the deviation between the true signal frequency and the peak frequency is rather large. To rectify the deviation, the energy centrobaric correction method (ECCM) for a discrete spectrum is adopted [16].

FFT is applied to a discrete signal with N elements. The normalized frequency deviation δ is given as

$$\delta =\frac{{\sum\nolimits_{{p= - n}}^{n} {G\left( {m+p} \right)p} }}{{\sum\nolimits_{{p= - n}}^{n} {G\left( {m+p} \right)} }},$$
(8)

where \(G\left( k \right)\) is the power spectrum, m is the central index of the frequency peak, and n is the number of elements to be considered as a signal instead of noise among the N elements to be analyzed with FFT in each transformation. Given the sampling frequency \({f_s}\), the signal frequency \({f_0}\) can be determined as follows:

$${f_0}=\frac{{\left( {m+\delta } \right){f_s}}}{N}.$$
(9)
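Equations (8) and (9) can be sketched in a few lines. The text does not state which window function is applied before the FFT; the Hann window and the choice n = 2 used below are assumptions of this sketch, and the function name is hypothetical.

```python
import numpy as np


def eccm_frequency(signal, fs, n=2):
    """Estimate the dominant frequency with an energy-centroid correction.

    Implements Eq. (8) for the normalized deviation delta and Eq. (9) for f0.
    Assumptions of this sketch: Hann window, mean removal, n = 2.
    """
    signal = np.asarray(signal, dtype=float)
    N = len(signal)
    windowed = (signal - signal.mean()) * np.hanning(N)
    G = np.abs(np.fft.rfft(windowed)) ** 2       # power spectrum G(k)

    m = int(np.argmax(G[1:])) + 1                # peak index, skipping the DC bin
    lo, hi = max(m - n, 0), min(m + n, len(G) - 1)
    p = np.arange(lo, hi + 1) - m                # offsets p = -n .. n
    delta = np.sum(G[lo:hi + 1] * p) / np.sum(G[lo:hi + 1])   # Eq. (8)
    return (m + delta) * fs / N                  # Eq. (9)


# Usage: a 525 Hz tone sampled at 30 kHz with N = 1024.
fs, N = 30_000, 1024
t = np.arange(N) / fs
tone = np.sin(2 * np.pi * 525.0 * t)
print("corrected estimate (Hz):", eccm_frequency(tone, fs))
```

In this example the uncorrected peak bin would sit on the FFT grid at a multiple of fs/N ≈ 29.3 Hz, while the centroid over the neighbouring bins moves the estimate toward the true tone frequency.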

4 Experiments

Figure 5a shows the experimental optoelectronic system. The object distance is 2.38 m, the focal length is 50 mm, and the optical magnification is M = 0.021. The linear CMOS, which is placed parallel to the translating track, contains 1024 pixels, and the dimension of each pixel is 14 µm × 14 µm. The row transfer frequency is set to 30 kHz, and the aperture is f/2.8.

Fig. 5 a The experimental optoelectronic system and b the circuit breaker

The method of illumination is dark-ground illumination, in which we use two 12 W LEDs. The angle from the central illuminating ray to the z-axis is approximately 40°.

As shown in Fig. 5b, the translating object, which is a metal bar controlled by an automatic mechanical device in the circuit breaker, moves between the two vertical blocks. A black and white card is attached to the bar to help focus and to improve the SNR, because the reflectivity of the card is greater than that of the matte bronze surface. However, the pattern on the card causes oscillation in the spatial frequency spectrum of the image, which may lead to instability in the velocity measurement. This instability can be addressed using the approach discussed later. The maximum velocity of the moving bar is approximately 2.8 m/s. The acceleration of the moving bar can reach up to 20 m/s², which means that the entire moving process can be completed within approximately 0.3 s. Such a high acceleration requires a suitable algorithm.

In this experiment, the reference velocity is provided by a linear encoder (MII6800, Celera Motion), which offers a 1% velocity error.

4.1 A: using RTSF

Figure 6a shows the image acquired from the linear CMOS. The vertical pixel index corresponds to time, and the horizontal pixel index is the position of the pixel in the linear CMOS. This image represents the tracking of the translating object, and the trend exhibited by the black and white lines indicates that the object is moving toward the right. The acquired data are saved as a matrix \({A_{mn}}\) in a computer. As shown in Eq. (10), the column vector \({B_{n1}}\) is a digital mask that simulates a transmission SF, as explained in Sect. 3. An example of \({B_{n1}}\) is [1, 1, 1, 1, −1, −1, −1, −1, 1, …], which represents an RTSF.

Fig. 6 a Acquired image and b the derived signal \({C_{m1}}\) of a part of the acquired image

$${C_{m1}}={A_{mn}}{B_{n1}}.$$
(10)

Figure 6b exhibits the summed grayscale \({C_{m1}}\), which is g(t) in Fig. 4. The frequency chirp indicates that the object is accelerating toward higher velocity.
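A minimal sketch of Eq. (10) on synthetic data is given below. The line-scan image is not the measured one: it is an idealized texture containing a single spatial period equal to the mask period, moving at a constant 2.8 m/s, which is an assumption made purely for illustration. The optical parameters are those quoted in this section, and all names are hypothetical.

```python
import numpy as np

FS_HZ = 30_000          # row transfer frequency
PIXEL_M = 14e-6         # pixel size
MAGNIFICATION = 0.021   # optical magnification
K = 8                   # mask period in pixels
N_PIXELS = 1024         # pixels per line
N_LINES = 2048          # number of lines (time samples) in the synthetic image
V_OBJECT = 2.8          # assumed constant object velocity in m/s

# Image shift per scanned line, in pixels.
shift_px_per_line = V_OBJECT * MAGNIFICATION / PIXEL_M / FS_HZ

# Synthetic image A (rows = time, columns = pixels): offset plus moving texture.
t = np.arange(N_LINES)[:, None]
x = np.arange(N_PIXELS)[None, :]
A = 100.0 + 50.0 * np.cos(2 * np.pi * (x - shift_px_per_line * t) / K)

# Digital mask B: +1/-1 rectangular filter with a period of K pixels.
B = np.where(np.arange(N_PIXELS) % K < K // 2, 1.0, -1.0)

# Eq. (10): one weighted sum per line gives the time signal C.
C = A @ B

# Dominant frequency of C (plain FFT peak) and the velocity from Eq. (6).
spectrum = np.abs(np.fft.rfft(C))
peak_bin = int(np.argmax(spectrum[1:])) + 1
f_peak = peak_bin * FS_HZ / N_LINES
v_est = K * PIXEL_M / MAGNIFICATION * f_peak
print(f"peak frequency {f_peak:.1f} Hz, estimated velocity {v_est:.3f} m/s")
```

Because the mask sums to zero over whole periods, the constant offset of 100 does not appear in C. The estimate is quantized to the FFT grid here; the correction of Eqs. (8) and (9) addresses that residual deviation.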

In this experiment, the period of the SF in vector B corresponds to k = 8 (pixels). Figure 6b shows the signal \({C_{m1}}\) with the offset component eliminated, which represents the relative signal intensity from the spatial filter with respect to time. By applying FFT to signal \({C_{m1}}\), we acquire its frequency spectra.

As shown in Fig. 6b, the frequency of the signal changes quickly because the acceleration of the object reaches up to 20 m/s². Therefore, both an excessively small and an excessively large frequency estimating range N, which is the number of signal elements analyzed with FFT at one time, would lead to an inaccurate frequency. In this experiment, N = 512 or N = 1024 is a suitable configuration.

Applying FFT to the signal derived from the images applied with an RTSF, four typical frequency spectra are obtained as shown in Fig. 7. Theoretically, the maximum peak is the desired signal, and the frequency of this peak can be converted to velocity; however, noise will hinder the analysis.

Fig. 7 a Acceptable frequency spectrum, b noisy spectrum, c loss of expected signal with triple frequency component salient, and d loss of expected signal

In this paper, the signal is defined as the desired frequency peak based on prior knowledge of the current velocity, shown as the peak inside the box in Fig. 7a, of which the width is empirically set to five elements; and any other frequency components, including unwanted harmonic peaks and background noise, are defined as noise. The SNR is calculated by dividing the integrated power of the signal in one spectrum by the integrated power of noise in the same spectrum.
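The SNR definition above can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the box is centred on the expected peak index, and every bin outside the box (including the lowest bins) counts as noise. The function names are hypothetical.

```python
import numpy as np


def spectrum_snr(power, expected_bin, box_width=5):
    """Ratio of integrated power inside the signal box to the power outside it.

    power        : one-sided power spectrum of a single FFT segment
    expected_bin : index of the desired peak, taken from prior velocity knowledge
    box_width    : width of the signal box in bins (five in the text)
    """
    power = np.asarray(power, dtype=float)
    half = box_width // 2
    lo = max(expected_bin - half, 0)
    hi = min(expected_bin + half + 1, len(power))
    signal_power = power[lo:hi].sum()
    noise_power = power.sum() - signal_power
    return signal_power / noise_power


def to_decibels(ratio):
    """Express a power ratio in dB."""
    return 10.0 * np.log10(ratio)


# Usage on a toy spectrum: flat noise floor with a five-bin peak around bin 50.
toy = np.ones(100)
toy[48:53] = 10.0
print("SNR:", spectrum_snr(toy, expected_bin=50))
```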

Compared with that in Fig. 7a, the SNRs in Fig. 7b, c, d are clearly lower. Figure 7a shows an acceptable frequency spectrum with SNR = 30.25. Although there is a triple frequency component, the influence on the signal frequency estimation can be ignored. In Fig. 7b, the noise intensity is similar, but the intensity of the signal is much weaker than that in Fig. 7a; thus, the SNR is lower, which will contribute to a larger velocity error. In Fig. 7c, the desired signal is nearly lost, and the triple frequency component is salient compared with the desired signal, which will lead to a triple velocity error. In Fig. 7d, the low-frequency component is dominant, and the desired signal is completely lost; thus, the low-frequency component causes a large deviation from the correct value in the velocity curve.

Every spectrum displayed in Fig. 7 corresponds to a velocity measurement feature in the curves in Fig. 8. Figure 8 shows curves of different moving processes with different accuracies. In Fig. 8a, the measured maximum velocity of the object is \({V_{\max }}\) = 1.229 m/s, and the reference velocity is 1.34 m/s. The error is approximately 8.3%, which is unsatisfactory. The other curves in Fig. 8 contain erroneous velocity values that are caused by undesirable behavior in the frequency spectra. Figure 8b shows an inaccurate velocity measurement, represented by a ragged curve. Figure 8c shows the triple velocity error, which occurs at one or several points; the velocity reaches three times the correct value and then returns to normal. This error occurs because the third harmonic is dominant in this region. Figure 8d shows that the velocity suddenly deviates to a value approaching zero and then returns to normal. This error occurs because the low-frequency component is dominant in this region. We conclude that the errors present in Fig. 8 are due to a substantial amount of background noise, a stochastic distribution of power between the fundamental peak and its harmonics, or a combination of the two.

Fig. 8 Velocity curve of four different moving processes: a accurate velocity curve, b inaccurate velocity curve, c velocity curve with a triple velocity error, and d loss of velocity in the process

The reason for these errors was explained previously with respect to Fig. 3. In practice, the power spectrum of a given SF is fixed in time, while the power spectrum of the object will vary due to any time-dependent changes in the images. For that reason, the reliability of the method can be improved by eliminating the contribution from the higher harmonics in the SF and any other background noise.

To eliminate these errors, an STSF is introduced. The power spectrum of the STSF is plotted in Fig. 3b. Compared with an RTSF, no higher harmonics are present in the STSF and, therefore, the SNR of the spectra using STSF is increased.

4.2 B: using STSF

Applying an STSF to the image acquired previously as in Fig. 6a, a typical signal curve is obtained as shown in Fig. 9a; its frequency spectrum is shown in Fig. 9b. This frequency spectrum is unimodal along the positive x-axis, and the higher-frequency components are all suppressed, which will dramatically reduce the error in the frequency estimation.

Fig. 9 a A typical signal curve after using STSF and b the corresponding frequency spectrum

To compare the effect of the STSF with that of the RTSF, the SNRs of the frequency spectra of the same experiment using the RTSF and the STSF with different values of k are shown in Figs. 10 and 11. The horizontal axis is the index of the frequency spectrum. To explain the index further, assuming the signal is composed of ABCDEFG…, the first spectrum describes signal segment ABCD, the second spectrum describes signal segment BCDE, and the third spectrum describes signal segment CDEF. Comparing the SNRs of individual spectra is not meaningful, partly because \({F_P}({\mu _0},0)\) is stochastic; the mean value of the SNR is more informative. Based on the comparison of the mean SNRs, we can draw three conclusions:

Fig. 10 SNRs of the frequency spectra using RTSFs with different k values

Fig. 11 SNRs of the frequency spectra for the same acquired images using STSFs with different k values

  1. As k increases, the mean SNR improves.

  2. The mean SNR is substantially improved by using an STSF with the same k as the RTSF. When k = 8, the mean SNR improvement is greater than 10 dB in this period of the moving process.

  3. The mean SNRs of the spectra with k = 4 in Figs. 10 and 11 are identical because four elements are not sufficient to simulate an STSF.

However, according to Eqs. (6) and (7), the signal frequency is inversely proportional to k; therefore, increasing k would decrease the signal frequency and further decrease the accuracy in the estimation.

Specifically, it is difficult to accurately estimate the frequency (i.e., the velocity) when the frequency is so low that it approaches the frequency resolution of the FFT. Empirically, when the velocity is less than three times the velocity resolution, the error of the calculated frequency is rather large even if the rectification algorithm is adopted. However, when the velocity is greater than that value, FFT in conjunction with ECCM works well in analyzing the frequency of the signal.

Given N = 1024, to ensure that the signal frequency of velocities greater than 0.5 m/s (the main focus of our experiment) is more than three times the frequency resolution and to yield a higher SNR, k = 8 was assigned as the optimal configuration in this experiment. The calculation is displayed in Table 1.

Table 1 The frequency calculation
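The contents of Table 1 are not reproduced in this text. As a hedged illustration of the reasoning only, and not a reconstruction of the table, the sketch below evaluates the signal frequency at 0.5 m/s for several filter periods and compares it with the FFT frequency resolution, using the parameters quoted in this section. The candidate k values are illustrative.

```python
FS_HZ = 30_000          # row transfer frequency
N_FFT = 1024            # FFT length N
PIXEL_M = 14e-6         # pixel size a
MAGNIFICATION = 0.021   # optical magnification M
V_MIN = 0.5             # lowest velocity of main interest, in m/s

freq_resolution = FS_HZ / N_FFT   # FFT bin spacing in Hz


def signal_frequency(velocity, k):
    """Signal frequency from Eqs. (6) and (7): f0 = v * M / (k * a)."""
    return velocity * MAGNIFICATION / (k * PIXEL_M)


ratios = {}
for k in (4, 8, 16):    # illustrative candidate periods in pixels
    f0 = signal_frequency(V_MIN, k)
    ratios[k] = f0 / freq_resolution
    print(f"k = {k:2d}: f0 = {f0:6.2f} Hz, f0 / resolution = {ratios[k]:.2f}")
```

Under these values, k = 8 keeps the 0.5 m/s signal just above three times the frequency resolution, whereas doubling the period to 16 pixels would drop it below that threshold.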

The typical velocity curve using an STSF with k = 8 is plotted in Fig. 12, and large deviations in the velocity curve no longer occur. Table 2 shows some of the velocity values measured with the STSF and their relative errors, which are within approximately ±3% and meet the proposed requirement.

Fig. 12 Typical velocity curve obtained using STSF

Table 2 Measured velocity values used with STSF and their relative errors

The time resolution of this SFV sensor with an STSF can reach 33.3 µs, which corresponds to one line period at the 30 kHz row transfer frequency and can be regarded as continuous measurement.

5 Conclusion

SFV with a CMOS sensor can achieve high accuracy in velocity measurements. The error in SFV using an RTSF is due to the low-SNR signal caused by the nonideal power spectrum of the image, the nonideal optical system and electrical noise. The STSF effectively eliminates the undesired frequency components, so that continuous measurement can be achieved. Experiments on a high-voltage circuit breaker using an STSF show that the errors in the velocity measurements are within approximately ±3% and that the time resolution can reach 33.3 µs at a line rate of 30 kHz. The optimal grating period in this experiment is 8 pixels.