Introduction

Beamforming is an indispensable component in pulse-echo medical ultrasound imaging. Delay-and-sum (DAS) beamforming is the most popular method used in clinical situations. The DAS beamformer compensates for time delays in ultrasonic echoes received by individual transducer elements by estimating the distance between the receiving focal point and each transducer element in the ultrasonic probe. To identify this distance, the speed of sound in the medium (tissue) needs to be assumed. However, the spatial distribution of the speed of sound in tissue is inhomogeneous and varies depending on the tissue type. Such a mismatch between the assumed and actual speed of sound leads to focusing errors, namely, phase aberration [1,2,3].

Ultrasound computed tomography (USCT) was developed for tissue characterization based on the acoustic properties of biological tissues [4,5,6,7,8]. USCT can estimate the spatial distribution of the speed of sound in tissue by measuring ultrasound transmitted through tissues from different angles. USCT is an established method of measuring the spatial distribution of the speed of sound, but it requires the ultrasound wave to be transmitted through the tissue. Therefore, compared with pulse-echo ultrasound imaging, the application of USCT is limited to cases such as breast imaging.

Methods for determining the speed of sound based on pulse-echo measurements have also been developed. The speed of sound can be estimated using the pulse-echo mode by measuring the round-trip time of flight of an ultrasonic wave and the distance (thickness) traveled. Ophir and Yazdi [9] developed a method for estimating the speed of sound by measuring the change in time of flight caused by a known tissue deformation induced by external compression. Recently, Nitta et al. [10] developed a method in which the thickness of tissue was measured by magnetic resonance imaging (MRI). Another approach is based on the pulse-echo time of flight profiles measured at two different positions using a hydrophone [11] or multiple positions using a linear array transducer [12].

Recently, Imbault et al. [13] developed a method using the spatial coherence evaluated from ultrasonic echo signals received by individual transducer elements. They calculated the autocorrelation function of the element echo signals at multiple lags [14] and estimated the speed of sound using a characteristic of the autocorrelation function, i.e., the number of lags generating high autocorrelation values reached a maximum when the assumed speed of sound coincided with the true speed of sound. The autocorrelation function of element echo signals is also used in pulse-echo medical ultrasonic imaging as the short-lag spatial coherence method [15, 16].

Yoon et al. [19] and Cho et al. [20] reported that the speed of sound can be estimated using the coherence factor (CF) [17, 18] evaluated from echoes from a strong point scatterer.

Focusing quality in DAS beamforming can be measured using the CF, which is the ratio of the coherent sum to the incoherent sum of element echo signals. Because the CF becomes low when there is a large focusing error, undesired components can be suppressed by weighting the beamformed radio frequency (RF) signals with the CF. A mismatch between the assumed and true speed of sound leads to a focusing error, resulting in a decrease in the CF. Therefore, the speed of sound can be estimated as the assumed speed of sound that maximizes the CF.

In the above-mentioned studies [12, 13, 19, 20], the speed of sound was estimated using a strong point scatterer in the medium. A distinct scatterer in the medium was used in [12, 19, 20], and in [13] a point scatterer was created in a diffuse scattering medium using an iterative beamforming method [21]. In our series of studies on improving the performance of ultrasound beamforming [22,23,24], we found that the CFs evaluated using the echoes from a diffuse scattering medium, which did not contain any distinct scatterers, also depend on the assumed speed of sound [25]. In the present study, the accuracy of our proposed method for estimating the speed of sound [25] was improved by properly considering the propagation time delay of the transmitted ultrasonic pulse. Also, the proposed method was implemented in both conventional line-by-line acquisition with a focused transmit beam and plane wave imaging to investigate the influence of the difference in ultrasonic fields. The method proposed in the present study does not require a distinct point target and can be applied to a diffuse scattering medium without any specific transmit–receive sequences, such as an iterative approach. The feasibility of the proposed method was validated using a phantom with a diffuse scattering medium.

Materials and methods

Beamforming

In the present study, beamformed RF signals s(m, n) were obtained based on DAS beamforming, where m is the lateral sampling number (m = 0, 1, 2,…, M − 1) and n is the axial sampling number (n = 0, 1, 2,…, N − 1).

To focus with respect to the same ultrasonic echo at different assumed speeds of sound cl (l = 0, 1, 2,…, L −1), the traveling time τtr(m, n) of an emitted ultrasonic pulse from the transmit aperture to the focal point (m, n) is calculated as follows:

$$\tau_{\text{tr}} \left( {m, n} \right) = \frac{n}{{f_{s} }} + \tau_{c} ,$$
(1)

where fs is the sampling frequency of received signals, and τc is a correction term corresponding to the time delay applied to the central element in the aperture.

To evaluate the change in CF due to the difference in the assumed speeds of sound cl, the CF obtained from the same echo needs to be analyzed. Using Eq. (1), the echo signal at the n-th sampled point can be analyzed under any assumed speed of sound cl.

In plane wave imaging, all elements in the transmit aperture are driven at the same time (t = 0). On the other hand, in the line-by-line sequence using a focused transmit beam, the central element in the aperture is driven with a delay relative to the elements at the edges of the aperture. The correction term τc in Eq. (1) corresponds to this time delay between the central and edge elements. In the present study, the correction term τc was set at 1.39 µs in the line-by-line sequence using a focused transmit beam. In the plane wave imaging, τc was set at zero.

Then, the traveling time τrc,k (m, n) of an ultrasonic echo from the focal point to the k-th element (k = 0, 1, 2,…, K − 1) is calculated as follows:

$$\tau_{{{\text{rc}}, k}} \left( {m,n} \right) = \frac{{\sqrt {\left( {c_{l} \cdot \frac{n}{{f_{s} }}} \right)^{2} + \left( {d \cdot \left( {k - 0.5\left( {K - 1} \right)} \right)} \right)^{2} } }}{{c_{l} }},$$
(2)

where d is the element pitch of a transducer array.

The total traveling time τk (m, n) was obtained as [τtr(m, n) + τrc,k(m, n)] and applied to the k-th element signal. The delayed element signals were summed for all K elements to obtain the DAS beamformer output sDAS(m, n).
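The delay calculation in Eqs. (1) and (2) can be sketched in Python as follows. This is a minimal illustration and not the in-house Matlab program used in the study: the function names, the (K, T) array layout of the element signals, and the use of linear interpolation to read the delayed samples are all assumptions. The lateral term is taken as the squared distance d·(k − 0.5(K − 1)) of the k-th element from the aperture center.

```python
import numpy as np

def das_delays(n, c_l, fs, pitch, K, tau_c=0.0):
    """Total delay tau_k in seconds for each of K elements at axial sample n.

    Follows Eqs. (1) and (2) as written: transmit time n/fs + tau_c, plus the
    receive path from a focal point at depth c_l*n/fs to the k-th element.
    """
    tau_tr = n / fs + tau_c                            # Eq. (1)
    depth = c_l * n / fs                               # axial distance to the focal point
    x_k = pitch * (np.arange(K) - 0.5 * (K - 1))       # lateral offset of each element
    tau_rc = np.sqrt(depth**2 + x_k**2) / c_l          # Eq. (2)
    return tau_tr + tau_rc

def das_sample(element_rf, delays, fs):
    """Read each element signal at its delay and sum over elements.

    element_rf: array of shape (K, T) with raw element signals (hypothetical layout).
    Linear interpolation between samples is an assumption of this sketch.
    Returns the delayed samples e_k (length K) and their sum s_DAS.
    """
    K, T = element_rf.shape
    t = np.arange(T) / fs                              # sampling instants of the raw data
    e_k = np.array([np.interp(delays[k], t, element_rf[k]) for k in range(K)])
    return e_k, e_k.sum()
```

With the values given in the experimental setup, a call might look like `das_delays(n, c_l, fs=31.25e6, pitch=0.2e-3, K=96, tau_c=0.0)` for plane wave imaging, repeated for each assumed speed of sound c_l.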

Coherence factor (CF)

Figure 1 illustrates DAS beamforming. As shown in Fig. 1a, ultrasonic echo signals received by individual elements are aligned after delay compensation when the focal point coincides exactly with the position of a scatterer. In such a case, the summed signal is enhanced. On the other hand, the summed signal is suppressed when the focal point does not coincide with the position of a scatterer because the element signals are not aligned, as shown in Fig. 1b. The CF evaluates such a focusing error using element echo signals.

Fig. 1

Illustrations of ultrasonic echo signals received by individual transducer elements. Time delays have already been applied to the signals by the DAS beamformer. a No focusing errors. b With focusing errors

Let us define the ultrasonic echo signal received by the k-th transducer element by ek (m, n), where the time delay τk (m, n) has already been applied to ek (m, n). The CF CF (m, n) at a spatial point (m, n) in the imaged region is obtained as follows [17, 18]:

$$CF\left( {m,n} \right) = \frac{{\left| {\mathop \sum \nolimits_{k = 0}^{K - 1} e_{k} \left( {m,n} \right)} \right|^{2} }}{{K\mathop \sum \nolimits_{k = 0}^{K - 1} \left| {e_{k} \left( {m,n} \right)} \right|^{2} }}.$$
(3)

The output sCF(m, n) of the DAS beamformer weighted with the CF is obtained as follows:

$$s_{\text{CF}} (m,n) = CF(m,n)\cdot \, s_{\text{DAS}} (m,n)$$
(4)
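Equations (3) and (4) translate directly into a few array operations. The sketch below assumes the delayed element signals are stored with the element index on the first axis; the function names and the convention of returning a CF of zero where all element signals vanish are illustrative choices, not part of the original method.

```python
import numpy as np

def coherence_factor(e):
    """Eq. (3): CF of delayed element signals e, element index on axis 0."""
    e = np.asarray(e)
    K = e.shape[0]
    coherent = np.abs(e.sum(axis=0)) ** 2              # |sum_k e_k|^2
    incoherent = K * (np.abs(e) ** 2).sum(axis=0)      # K * sum_k |e_k|^2
    # Assumption: define CF = 0 where the denominator is zero.
    safe = np.where(incoherent > 0, incoherent, 1.0)
    return np.where(incoherent > 0, coherent / safe, 0.0)

def cf_weighted_das(e):
    """Eq. (4): DAS output (sum over elements) weighted by the CF."""
    e = np.asarray(e)
    return coherence_factor(e) * e.sum(axis=0)
```

Perfectly aligned identical signals give a CF of 1, while signals that cancel in the coherent sum give a CF of 0, which matches the two cases illustrated in Fig. 1.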

Estimation of speed of sound

In the present study, the CFs CFl (m, n) were obtained under L different assumed speeds of sound. In the experimental validation, the assumed speed of sound was changed from 1480 to 1600 m/s in steps of 5 m/s. The speed of sound in the medium is estimated as the assumed speed of sound that gives the maximum CF averaged in the region of interest (ROI). The averaged CF C̅Fl at the assumed speed of sound cl (l = 0, 1, 2,…, L − 1) is obtained as follows:

$$\overline{CF}_{l} = \frac{1}{{K_{\text{ROI}} }}\mathop \sum \limits_{(m,n) \in R} CF_{l} \left( {m,n} \right),$$
(5)

where R denotes the ROI, and KROI is the number of CF values in the ROI. In the present study, the lateral ROI size was fixed to the entire imaging field of view. The influence of the axial ROI size on estimating the speed of sound was investigated.
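The estimation step around Eq. (5) can be illustrated with the short sketch below. It assumes that the CF maps for all assumed speeds of sound have already been computed and stacked into an array of shape (L, M, N), and that the ROI is given as a boolean mask; the function name and array layout are hypothetical.

```python
import numpy as np

def estimate_speed_of_sound(cf_stack, speeds, roi):
    """Pick the assumed speed of sound that maximizes the ROI-averaged CF.

    cf_stack: array (L, M, N), one CF map per assumed speed (hypothetical layout).
    speeds:   array (L,) of assumed speeds of sound in m/s.
    roi:      boolean array (M, N), True inside the region of interest.
    """
    mean_cf = np.array([cf[roi].mean() for cf in cf_stack])   # Eq. (5)
    return speeds[np.argmax(mean_cf)], mean_cf

# Assumed speeds used in the experimental validation: 1480 to 1600 m/s in 5 m/s steps.
assumed_speeds = np.arange(1480, 1601, 5).astype(float)
```

The function returns the full profile of averaged CFs as well as the peak location, so the same profile can later be interpolated or plotted against the assumed speed of sound.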

In estimating the averaged CF defined by Eq. (5), the CFs obtained from echo signals included those with low signal-to-noise ratios (SNRs). To exclude such CFs with low SNRs, the maximum CFs CFmax(m, n) and the average amplitude of beamformed signals \(\left| s \right|\)(m, n) were defined as follows:

$$CF_{\rm{max}} (m,n) = \max_{l} CF_{l} \left( {m, n} \right),$$
(6)
$$\left| s \right|\left( {m,n} \right) = \frac{1}{L}\mathop \sum \limits_{l = 0}^{L - 1} \left| {s_{CF,l} \left( {m,n} \right)} \right|.$$
(7)

In estimating the speed of sound, the CFs CFl(m, n) obtained at points (m, n) where CFmax(m, n) or \(\left| s \right|\)(m, n) was lower than the assigned threshold were excluded from the calculation of the average CF C̅Fl.
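The exclusion rule based on Eqs. (6) and (7) can be sketched as a masking step. The threshold is taken here as a fraction of the maximum of the chosen criterion within the supplied data, which follows the "20% of their maximum value" wording used in the experimental results; the function name, the `criterion` argument, and the assumption that the input arrays are already cropped to the ROI are illustrative.

```python
import numpy as np

def thresholded_mean_cf(cf_stack, s_cf_stack, threshold=0.2, criterion="amplitude"):
    """Average CF per assumed speed, using only points that pass the threshold.

    cf_stack:   array (L, M, N) of CF maps, already cropped to the ROI (assumption).
    s_cf_stack: array (L, M, N) of CF-weighted DAS outputs s_CF,l(m, n).
    threshold:  fraction of the maximum of the criterion map.
    criterion:  "amplitude" uses Eq. (7); "cf_max" uses Eq. (6).
    """
    if criterion == "amplitude":
        ref = np.abs(s_cf_stack).mean(axis=0)          # Eq. (7)
    elif criterion == "cf_max":
        ref = cf_stack.max(axis=0)                     # Eq. (6)
    else:
        raise ValueError("criterion must be 'amplitude' or 'cf_max'")
    mask = ref > threshold * ref.max()                 # keep points above the threshold
    return cf_stack[:, mask].mean(axis=1), mask
```

The returned mask also gives the number of CF values that contribute to the average via `mask.sum()`, which is useful for checking how much data remains at a given threshold.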

Experimental setup

An ultrasound imaging phantom (model 040GSE, CIRS) was used for validation of the proposed method. The homogeneous diffuse scattering medium contained in the phantom was measured, and the speed of sound in the medium was 1540 m/s.

A linear array ultrasonic probe at a nominal center frequency of 7.5 MHz (UST-5412, Hitachi) was used for the acquisition of ultrasonic echo signals. The element pitch of the linear array was 0.2 mm. Ultrasonic echo signals received by individual transducer elements were acquired by a custom-made ultrasound scanner (RSYS0002, Microsonic) at a sampling frequency of 31.25 MHz.

In the present study, a conventional line-by-line sequence using a focused transmit beam and ultrafast plane wave imaging were used to investigate the influence of transmit beam patterns on estimating the speed of sound.

In the conventional line-by-line sequence, a transmit beam focused at a depth of 20 mm was produced by 72 elements with rectangular apodization, and scattered echoes were received by 96 elements. The centers of the transmit and receive apertures were located at the same position. In each firing, a dynamically focused receiving beam was created using echo signals received by the 96 elements with rectangular apodization. The same procedure was repeated 121 times by changing the aperture position using a pitch of 0.2 mm.

The ultrafast plane wave imaging sequence described in [26] was used in the present study. A plane wave was emitted by 96 elements, and scattered echoes were received by the same 96 elements. A dynamically focused receiving beam was created using the echo signals received by 72 elements with rectangular apodization. Consequently, (96 −72) = 24 receiving beams with spacing of 0.2 mm could be created in each firing. The same procedure was repeated four times by changing the aperture position using a pitch of 24 ×0.2 = 4.8 mm. Such a sequence produced 96 receiving beams with a spacing of 0.2 mm in one image frame.

The receive beamforming and the estimation of the speed of sound on the ultrasound echo signals received by the individual elements were performed off-line using an in-house software program based on Matlab (MathWorks Inc., Natick, MA).

Experimental results

Figure 2a and b shows B-mode images of a phantom obtained by the conventional line-by-line sequence and plane wave imaging, respectively. The homogeneous diffuse scattering medium is visualized by both imaging methods. As shown in Fig. 2, in the region shallower than 5 mm, there is strong noise due to the high-voltage pulses applied to the transducer elements. There is also some complex interference between the ultrasonic waves emitted from the transducer elements. Therefore, ultrasound data from a region shallower than 12.24 mm (500 sampled points) were excluded in subsequent analyses.

Fig. 2

B-mode images of phantom at an assumed speed of sound of 1540 m/s. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Figure 3a and b shows the average amplitudes of beamformed signals \(\left| s \right|\)(m, n) obtained by the line-by-line sequence and plane wave imaging, respectively. As can be seen in Fig. 3a, the average amplitudes are significantly large around a focal depth of 20 mm. Figure 4a and b shows the maximum CFs CFmax(m, n) obtained by the line-by-line sequence and plane wave imaging, respectively. As with the case of the average amplitudes of beamformed signals \(\left| s \right|\)(m, n) obtained by line-by-line sequence, the maximum CFs CFmax(m, n) are also significantly large around a focal distance of 20 mm.

Fig. 3

Distribution of average amplitude of beamformed RF signals. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Fig. 4

Distribution of maximum coherence factor. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Figure 5a and b shows the average CFs C̅Fl obtained by the line-by-line sequence and plane wave imaging, respectively, plotted as functions of the assumed speed of sound cl. In Fig. 5, all CF values in the ROI from a depth of 12.24–24.48 mm [axial width: 12.24 mm (500 sampled points)] are averaged to obtain C̅Fl without setting any threshold. The speed of sound was estimated from the peaks of the profiles in Fig. 5 with interpolation by a factor of 10 (resolution: 0.5 m/s). The speed of sound was estimated to be 1525.0 m/s (0.97% error) and 1604.5 m/s (4.19% error) by the line-by-line sequence and plane wave imaging, respectively. Errors in the estimated speed of sound were large, particularly in the plane wave imaging, when a threshold was not set.
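The peak search with tenfold interpolation and the quoted percentage errors can be reproduced in outline as follows. The interpolation kernel is not stated in the text, so this sketch assumes a quadratic fit through the coarse peak and its two neighbours, evaluated on a grid ten times finer (0.5 m/s for a 5 m/s step); the function names are hypothetical, and the reference value of 1540 m/s is the phantom's speed of sound.

```python
import numpy as np

def interpolated_peak(speeds, mean_cf, factor=10):
    """Locate the peak of the averaged-CF profile on a grid `factor` times finer.

    Assumption: a parabola is fitted to the coarse maximum and its two neighbours.
    """
    speeds = np.asarray(speeds, dtype=float)
    mean_cf = np.asarray(mean_cf, dtype=float)
    i = int(np.argmax(mean_cf))
    i = min(max(i, 1), len(speeds) - 2)                # keep the 3-point window on the grid
    x = speeds[i - 1:i + 2]
    coeffs = np.polyfit(x - x[1], mean_cf[i - 1:i + 2], 2)
    step = (speeds[1] - speeds[0]) / factor            # fine-grid spacing
    fine = x[0] + step * np.arange(2 * factor + 1)     # fine grid spanning the window
    values = np.polyval(coeffs, fine - x[1])
    return fine[np.argmax(values)]

def percent_error(estimate, reference=1540.0):
    """Relative error of an estimate with respect to the reference speed, in percent."""
    return 100.0 * abs(estimate - reference) / reference
```

For example, `percent_error(1525.0)` evaluates to about 0.97 and `percent_error(1604.5)` to about 4.19, in agreement with the errors quoted for Fig. 5.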

Fig. 5

Average coherence factor obtained from all CF values in the depth range of 12.24–24.48 mm (500th–999th sampled points). a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Figure 6a and b shows the average CFs C̅Fl obtained by the line-by-line sequence and plane wave imaging, respectively, by averaging the CF values at spatial points where the maximum CFs CFmax(m, n) were larger than 20% of their maximum value. As can be seen in Fig. 6, the profiles show peaks even in the case of plane wave imaging. The speed of sound estimated by the line-by-line sequence and plane wave imaging was 1525.0 m/s (0.97% error) and 1545.5 m/s (0.36% error), respectively, and the estimation error with plane wave imaging was reduced.

Fig. 6

Average coherence factor obtained with respect to the depth range of 12.24–24.48 mm (500th–999th sampled points). CF values at spatial points, where the maximum CF values are over 20% of their maximum value, were used to obtain the average coherence factor. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Figure 7a and b shows the average CFs C̅Fl obtained by the line-by-line sequence and plane wave imaging, respectively, by averaging the CF values at spatial points where the average amplitudes \(\left| s \right|\)(m, n) were larger than 20% of their maximum value. The peaks of the profiles can be recognized more clearly by setting thresholds for the CFs based on the amplitudes of the echo signals. The speeds of sound estimated by the line-by-line sequence and plane wave imaging were 1528.5 m/s (0.75% error) and 1544.5 m/s (0.29% error), respectively. Particularly with the plane wave imaging, an accurate estimate of the speed of sound could be obtained by setting a threshold for the average amplitudes of beamformed signals \(\left| s \right|\)(m, n). More accurate estimates of the speed of sound could be obtained by setting this threshold because \(\left| s \right|\)(m, n) was obtained from DAS outputs weighted by CFs, i.e., it contains information on both the amplitude of echo signals and the CF values. Using \(\left| s \right|\)(m, n) provided more reliable CF values for estimating the speed of sound.

Fig. 7

Average coherence factor obtained with respect to the depth range of 12.24–24.48 mm (500th–999th sampled points). CF values at spatial points, where the average amplitudes of beamformed RF signals are larger than 20% of the maximum value, were used to obtain the average coherence factor. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Figure 8 shows the profiles of the average CFs C̅Fl obtained with different threshold values applied to the average amplitudes of beamformed signals \(\left| s \right|\)(m, n). Figure 8a and b was obtained by the line-by-line sequence and plane wave imaging, respectively. The estimated speed of sound depends on the threshold value, but errors in the estimation were smallest when the threshold value was set at 0.2 and 0.8 in the line-by-line sequence and at 0.2 in plane wave imaging. To obtain stable estimates of the speed of sound, it would be preferable to use a threshold value of 0.2 because a greater number of CF values can be used with a low threshold value. Figure 9 shows the number of CF values used for estimating the speed of sound. As can be seen in the figure, the number decreases significantly as the threshold value increases.

Fig. 8

Average coherence factor obtained with respect to the depth range of 12.24–24.48 mm (500th–999th sampled points). Different threshold values were applied to the average amplitudes of beamformed RF signals. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

Fig. 9

Number of data used for estimating the speed of sound

The speed of sound was also estimated with smaller axial sizes of ROIs. The axial ROI size was set at 12.24 mm (500 sampled points), 6.12 mm (250 sampled points), 2.448 mm (100 sampled points), and 1.224 mm (50 sampled points). If the speed of sound can be estimated with a smaller ROI, it may become possible to estimate the axial distribution of the speed of sound sequentially from a shallow region to a deep region. Figure 10a and b shows the speed of sound estimated with the line-by-line sequence and plane wave imaging, respectively. In the line-by-line sequence, the estimated speed of sound was significantly dependent on the depth. The speed of sound estimated with a large axial ROI size of 12.24 mm was very close to that around a focal distance of 20 mm estimated with smaller axial ROI sizes. Echoes from a region around the focus were considered to contribute more to estimating the speed of sound because the amplitudes of beamformed signals around the focus were significantly larger than those in other regions.

Fig. 10

Speed of sound estimated with different axial sizes of ROIs. a Line-by-line sequence with transmit beam focused at 20 mm. b Plane wave imaging

On the other hand, the estimates obtained by plane wave imaging were less dependent on the depth, but the deviation increased with a smaller axial ROI size. Accurate estimates could be obtained with axial ROI sizes of over 2.448 mm by plane wave imaging.

Discussion

In the present study, a method using the CF was developed for estimating the speed of sound in a medium. Unlike the methods in [12, 13], the proposed method does not require any point targets in the medium. Good estimates were obtained by the proposed method in experiments using a phantom with a homogeneous diffuse scattering medium.

The proposed method was examined using a line-by-line sequence with a focused transmit beam and plane wave imaging to investigate the influence of transmit ultrasonic fields on estimating the speed of sound. In Fig. 10a, an apparent trend can be seen in the speed of sound estimated with small ROIs, which is presumably caused by the focused ultrasonic field, but the deviation in the speed of sound estimated using a focused transmit beam is smaller than that estimated using a plane wave. It could be considered that the deviation in the line-by-line sequence with a focused transmit beam was small because the produced sound pressure was larger than that in the plane wave imaging, i.e., the SNRs of echo signals were better in the line-by-line sequence than in the plane wave imaging. The trend in the estimated speed of sound in Fig. 10a might be reduced by decreasing the size of the transmit aperture because the degree of focusing is reduced. On the other hand, the produced sound pressure will decrease with a small transmit aperture, and the SNRs of echo signals would be degraded. In our future work, such effects of the transmit aperture size on estimating the speed of sound with a focused transmit beam will be investigated.

Such optimization of the transmitted ultrasonic field is considered important for two reasons: the undesirable trend in the speed of sound estimated with focused beams, as described above, and the small change in the CF obtained with plane waves, as shown in Fig. 7. In plane wave imaging, the CF is generally lower than that obtained with focused beams because echoes from the receiving focal point are contaminated by echoes from other spatial points. This is because a wide transmit beam (plane wave) illuminates not only a scatterer at the focal point but also those at other spatial points. The CF is also decreased by noise, as described above. When the difference between the assumed and true speed of sound is reduced, the CF increases, but the increment is smaller than that obtained with focused beams because the echoes are still contaminated by out-of-focus echoes. A transmit beam with a stable wavefront, narrow beam width, and high sound pressure would be preferable for estimating the CF. However, it is difficult to realize such a transmit beam. A focused beam or plane wave transmitted using a small aperture, or a loosely focused beam (with the focal point farther than the maximum imaging depth), would be a candidate for estimating the CF. In our future work, such optimization of the transmitted ultrasonic field will be conducted.

In plane wave imaging, the bias error in the estimated speed of sound seemed small, i.e., the estimates were scattered around the true speed of sound. However, the deviation in the estimated speed of sound was large when the axial ROI size was small. This was attributed to two factors: the SNRs of echo signals were lower than those in the line-by-line sequence with a focused beam, and the number of CF values used in estimating the speed of sound decreased dramatically when a threshold was set for the average amplitudes of beamformed signals, as shown in Fig. 9. A more homogeneous transmit sound field distribution might contribute to obtaining better estimates of the CF values. Only rectangular apodization could be used in the experimental system adopted for the present study, and the sound pressure distribution across a plane wave wavefront might show fluctuations in the case of rectangular apodization. The sound pressure distribution across a wavefront would be more homogeneous using Tukey apodization [27]. In our future work, such effects of transmit apodization will also be investigated.

Conclusion

In medical ultrasonic imaging based on the pulse-echo method, the speed of sound in the medium needs to be assumed. For more accurate beamforming, an initial study was conducted to develop a method for estimating the speed of sound in the medium based on the CF evaluated using ultrasonic echo signals received by individual transducer elements. In the validation experiments using a phantom with a homogeneous diffuse scattering medium, the speed of sound was estimated using the CFs selected by setting a threshold value for the average amplitudes of beamformed signals. The proposed method does not require any specific transmit–receive sequence and would be preferable for practical applications.