Introduction

Working and field conditions may cause seismic records to have a low signal-to-noise ratio (SNR) and low resolution. Many seismic denoising methods have been proposed, including the median filter (Bednar 1983), fx deconvolution (Canales 1984), the fk filter (Stewart and Schieck 1989) and curvelet thresholding (Neelamani et al. 2008). In 2012, the non-local mean (NLM) filter was applied to random noise suppression in land seismic data (Bonar and Sacchi 2012; Shang et al. 2013) and achieved good noise suppression. However, for complicated desert seismic records, conventional seismic noise reduction methods, including the non-local mean filter, have turned out to be ineffective. Research on detecting desert seismic events, improving the resolution of effective signals and reducing noise is therefore of great significance for desert seismic exploration.

To address these problems, dimension reduction and clustering techniques from machine learning are introduced into desert seismic signal processing to compensate for the shortcomings of the non-local mean filter on desert seismic signals. Common dimension reduction methods include principal component analysis (PCA) (Anderson 1963; Tipping and Bishop 2014), linear discriminant analysis (LDA) (Ye et al. 2006; Bandos et al. 2009; Yu and Yang 2001; Yang et al. 2005) and manifold learning (Meng et al. 2017). PCA is an unsupervised linear dimension reduction algorithm. It projects high-dimensional data into a low-dimensional space so that the variance of the projected sample points is maximized, that is, so that the points are scattered as much as possible. For desert seismic records, however, the low-dimensional data produced by PCA are merely dispersed, and effective signal points cannot be distinguished from noise points. In this respect, LDA is superior to PCA: it is a supervised linear dimension reduction method, so it can classify the data while reducing dimensions.

LDA was developed from Fisher discriminant analysis, which was first proposed by Fisher (1936) for the two-class problem. It assumes that the covariance matrices of all classes are identical and of full rank. LDA can be used not only for two-class problems but also for multi-class problems. LDA obtains the optimal projection direction by training on a labeled training set of desert seismic signals. When this projection direction is applied to new desert seismic data, effective signals and noise can be separated in a low-dimensional space. The effective signal is then extracted to locate the seismic events, and finally the non-local mean filter is used for noise removal. Experiments show that the results obtained by this method are better than those obtained by directly applying the non-local mean filter or fx deconvolution.

Random noise reduction framework

The framework consists of two parts: signal detection and filtering. In this paper, LDA is selected as the signal detection method. LDA relies on learning from a training set, and the choice of training set directly determines whether effective signals and noise in new seismic data can be separated after projection into the low-dimensional space. It is therefore extremely important that the synthetic seismic signals in the training set resemble the real desert seismic signals in every feature.

Linear discriminant analysis

The idea of LDA is as follows. Given a training set, the samples are projected onto a line such that the projections of samples from the same class are as close to each other as possible and the projections of samples from different classes are as far apart as possible. A new sample is then projected onto the same line and classified according to the position of its projection.

For a given data set \(D = \{ ({\mathbf{x}}_{{\mathbf{i}}} ,y_{i} )\}_{i = 1}\), \(y_{i} \in \{ 0,1\}\), let \(X_{i}\), \({\varvec{\upmu}}_{{\mathbf{i}}}\) and \(\Sigma _{{\mathbf{i}}}\) denote, respectively, the set of examples, the mean vector and the covariance matrix of class \(i \in \{ 0,1\}\). When all the sample data are projected onto a straight line \({\varvec{\upomega}}\), the projections of the two class centers are \({\varvec{\upomega}}^{\rm T} {\varvec{\upmu}}_{{\mathbf{0}}}\) and \({\varvec{\upomega}}^{\rm T} {\varvec{\upmu}}_{{\mathbf{1}}}\), and the covariances of the projected samples of the two classes are \({\varvec{\upomega}}^{\rm T} \sum\nolimits_{0} {\varvec{\upomega}}\) and \({\varvec{\upomega}}^{\rm T} \sum\nolimits_{1} {\varvec{\upomega}}\), respectively. Following the basic idea of LDA, samples of the same class should be projected as close together as possible and samples of different classes as far apart as possible. Hence the within-class covariance of the projected points, \({\varvec{\upomega}}^{\rm T} \sum\nolimits_{0} {\varvec{\upomega}} + {\varvec{\upomega}}^{\rm T} \sum\nolimits_{1} {\varvec{\upomega}}\), should be as small as possible, and the squared distance between the projected class centers, \(\| {{\varvec{\upomega }}^{\rm T} {\varvec{\upmu }}_{{\mathbf{0}}} - {\varvec{\upomega }}^{\rm T} {\varvec{\upmu }}_{{\mathbf{1}}} } \|_{2}^{2}\), should be as large as possible. This gives the following objective to be maximized (Ye et al. 2006):

$$J = \frac{{\left\| {{\varvec{\upomega}}^{\rm T} {\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upomega}}^{\rm T} {\varvec{\upmu}}_{{\mathbf{1}}} } \right\|_{2}^{2} }}{{{\varvec{\upomega}}^{\rm T}\Sigma _{0} {\varvec{\upomega}} + {\varvec{\upomega}}^{\rm T}\Sigma _{1} {\varvec{\upomega}}}} = \frac{{{\varvec{\upomega}}^{\rm T} \left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)\left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)^{\rm T} {\varvec{\upomega}}}}{{{\varvec{\upomega}}^{\rm T} \left( {\sum_{0} + \sum_{1} } \right){\varvec{\upomega}}}}$$
(1)

The “within-class scatter matrix” is defined as (Fukunaga 1990):

$${\mathbf{S}}_{{\varvec{\upomega}}} = \sum_{0} + \sum_{1} = \sum\limits_{{{\mathbf{x}} \in X_{0} }} {\left( {{\mathbf{x}} - {\varvec{\upmu}}_{{\mathbf{0}}} } \right)\left( {{\mathbf{x}} - {\varvec{\upmu}}_{{\mathbf{0}}} } \right)^{\rm T} } + \sum\limits_{{{\mathbf{x}} \in X_{1} }} {\left( {{\mathbf{x}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)\left( {{\mathbf{x}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)^{\rm T} }$$
(2)

The “between-class scatter matrix” is defined as (Fukunaga 1990):

$${\mathbf{S}}_{{\mathbf{b}}} = \left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)\left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)^{\rm T}$$
(3)

Then, Eq. (1) can be simplified as follows:

$$J = \frac{{{\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}}}}{{{\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}}}}$$
(4)

Equation (4) is the objective maximized by the LDA algorithm. It is the “generalized Rayleigh quotient” of \({\mathbf{S}}_{{\mathbf{b}}}\) and \({\mathbf{S}}_{{\varvec{\upomega}}}\).

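Because the value of \(J\) in Eq. (4) does not change when \({\varvec{\upomega}}\) is rescaled, only the direction of \({\varvec{\upomega}}\) matters. To obtain a unique solution, we normalize the denominator of Eq. (4) by setting \({\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}} = 1\); maximizing Eq. (4) is then equivalent to: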

$$\begin{aligned} & \mathop {\hbox{min} }\limits_{{\varvec{\upomega}}} \left( { - {\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}}} \right) \\ & {\text{s}} . {\text{t}} .\;\left( {{\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}} = 1} \right) \\ \end{aligned}$$
(5)

By the Lagrange multiplier method, Eq. (5) leads to:

$${\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}} = \lambda {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}}$$
(6)

where \(\lambda\) is the Lagrange multiplier. Equation (6) shows that \({\varvec{\upomega}}\) is an eigenvector of \({\mathbf{S}}_{{\varvec{\upomega}}}^{ - 1} {\mathbf{S}}_{{\mathbf{b}}}\).
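For completeness, the step from Eq. (5) to Eq. (6) can be written out as follows. This is a standard derivation sketch rather than a quotation from the cited references; it uses only the symmetry of \({\mathbf{S}}_{{\mathbf{b}}}\) and \({\mathbf{S}}_{{\varvec{\upomega}}}\) and, for the closed form in the last line, the assumption that \({\mathbf{S}}_{{\varvec{\upomega}}}\) is invertible.

$$\begin{aligned} L\left( {{\varvec{\upomega}},\lambda } \right) & = - {\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}} + \lambda \left( {{\varvec{\upomega}}^{\rm T} {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}} - 1} \right) \\ \frac{\partial L}{{\partial {\varvec{\upomega}}}} & = - 2{\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}} + 2\lambda {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}} = {\mathbf{0}}\quad \Rightarrow \quad {\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}} = \lambda {\mathbf{S}}_{{\varvec{\upomega}}} {\varvec{\upomega}} \\ \end{aligned}$$

Because \({\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}} = \left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)\left[ {\left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)^{\rm T} {\varvec{\upomega}}} \right]\) and the bracketed term is a scalar, \({\mathbf{S}}_{{\mathbf{b}}} {\varvec{\upomega}}\) is always parallel to \({\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}}\), so that, up to a scale factor,

$${\varvec{\upomega}} \propto {\mathbf{S}}_{{\varvec{\upomega}}}^{ - 1} \left( {{\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} } \right)$$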

For a new data set \({\mathbf{X}}\), its low-dimensional projection data set \({\mathbf{Y}}\) can be expressed as:

$${\mathbf{Y}} = {\varvec{\upomega}}^{\rm T} {\mathbf{X}}$$
(7)
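Equations (2), (6) and (7) can be illustrated with a minimal NumPy sketch. It is not the implementation used in this paper: the function name `lda_direction`, the toy two-dimensional Gaussian data and the storage of samples as rows are assumptions made for illustration. The direction is obtained from the closed form \({\mathbf{S}}_{{\varvec{\upomega}}}^{ - 1} ({\varvec{\upmu}}_{{\mathbf{0}}} - {\varvec{\upmu}}_{{\mathbf{1}}} )\), which solves Eq. (6) up to scale when \({\mathbf{S}}_{{\varvec{\upomega}}}\) is invertible.

```python
import numpy as np

def lda_direction(X0, X1):
    """Two-class LDA direction; samples are the rows of X0 and X1 (hypothetical helper)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, Eq. (2): sum of outer products of the centred samples.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # S_b @ omega is always parallel to (mu0 - mu1), so Eq. (6) is solved,
    # up to scale, by omega = Sw^{-1} (mu0 - mu1).
    omega = np.linalg.solve(Sw, mu0 - mu1)
    # Rescale so that omega^T Sw omega = 1, the constraint in Eq. (5).
    omega /= np.sqrt(omega @ Sw @ omega)
    return omega, mu0, mu1

# Toy data (assumed): two Gaussian clouds in two dimensions.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[3.0, 1.0], scale=1.0, size=(200, 2))

omega, mu0, mu1 = lda_direction(X0, X1)
y0, y1 = X0 @ omega, X1 @ omega   # Eq. (7), applied to each sample
```

With samples stored as rows, `X @ omega` plays the role of \({\varvec{\upomega}}^{\rm T} {\mathbf{X}}\) in Eq. (7); the two projected classes then lie on opposite sides of the line.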

Training set

LDA needs a corresponding training set. However, there is no mature available training set for desert seismic signals. Therefore, an appropriate training set needs to be generated to process the real desert seismic signal.

To generate the training set, it is necessary to understand the characteristics of the sample data. In seismic denoising, random noise is often assumed to be stationary and Gaussian. Statistical analysis shows, however, that desert seismic random noise is not strictly stationary but only locally stationary. Moreover, a Gaussianity test shows that desert random noise is non-Gaussian (Zhong et al. 2015a, b). In terms of linearity, Zhong et al. (2015a, b) showed that desert seismic random noise is nonlinear.

In seismic signal inversion, Ricker waves are commonly used to synthesize seismic signals. Considering the complexity of desert seismic signals and the generalization performance of the method, general zero-phase waves and mixed-phase waves are also added to the training set. Previous studies of desert seismic signals have shown that the frequency of the effective signals is around 30 Hz, while the noise ranges from several hertz to twenty hertz (Li and Li 2016; Li et al. 2017). To ensure a rich set of frequency components, the seismic wavelets in the training set use 11 dominant frequencies: 20 Hz, 22 Hz, 24 Hz, 26 Hz, 28 Hz, 30 Hz, 32 Hz, 34 Hz, 36 Hz, 38 Hz and 40 Hz. The noise part of the training set is synthetic desert noise (Li and Li 2016; Li et al. 2017), whose characteristics are very similar to those of real desert noise.

The general formula for constructing Ricker waves is:

$$f(t) = A\left[ {1 - 2 \times \left( {\pi f_{0} t} \right)^{2} } \right] \times {\text{e}}^{{ - \left( {\pi f_{0} t} \right)^{2} }}$$
(8)

The general formula for constructing zero-phase waves is:

$$f(t) = A\cos \left[ {2\pi f_{0} \left( {t - t_{0} } \right)} \right] \times {\text{e}}^{{ - \pi^{2} f_{0}^{2} \left( {t - t_{0} } \right)^{2} }}$$
(9)

The general formula for constructing mixed-phase waves is:

$$f(t) = A\sin \left[ {2\pi f_{0} \left( {t - t_{0} } \right)} \right] \times {\text{e}}^{{ - \pi^{2} f_{0}^{2} \left( {t - t_{0} } \right)^{2} }}$$
(10)

where \(A\) is the amplitude, \(f_{0}\) is the dominant frequency and \(t_{0}\) is the time delay.

The LDA training set requires two classes of training data: noisy signals and the noise at the corresponding positions. Therefore, Ricker waves contaminated by synthetic desert random noise, together with the synthetic desert random noise itself, are used as the two classes of sample data. Samples for the general zero-phase waves and mixed-phase waves at the corresponding frequency components are obtained in the same way. For each wave type and each frequency component, 50 traces are extracted. In total, 1650 noisy signal samples (3 wave types × 11 frequencies × 50 traces) and the corresponding 1650 noise samples are obtained; together they constitute the training set used to learn the best projection direction \({\varvec{\upomega}}\).
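The construction described above can be sketched as follows, under several labeled assumptions. Equations (8) to (10) are implemented directly, but the sample length (40 points), the sampling frequency (500 Hz), the centring of each wavelet in its window and, above all, the noise are placeholders: white Gaussian noise stands in for the synthetic desert noise model, which is not specified here. All function and constant names are hypothetical.

```python
import numpy as np

def ricker(t, f0, A=1.0):
    """Eq. (8); t is the time relative to the wavelet centre."""
    a = (np.pi * f0 * t) ** 2
    return A * (1.0 - 2.0 * a) * np.exp(-a)

def zero_phase(t, f0, t0=0.0, A=1.0):
    """Eq. (9)."""
    tau = t - t0
    return A * np.cos(2 * np.pi * f0 * tau) * np.exp(-(np.pi * f0 * tau) ** 2)

def mixed_phase(t, f0, t0=0.0, A=1.0):
    """Eq. (10)."""
    tau = t - t0
    return A * np.sin(2 * np.pi * f0 * tau) * np.exp(-(np.pi * f0 * tau) ** 2)

FS = 500.0                      # sampling frequency in Hz (assumed)
WIN = 40                        # points per training sample (assumed)
FREQS = np.arange(20, 42, 2)    # 20, 22, ..., 40 Hz: 11 dominant frequencies
TRACES = 50                     # traces per wave type and frequency

def placeholder_noise(rng, n):
    # Stand-in only: white Gaussian noise, NOT the synthetic desert noise model.
    return rng.normal(scale=0.5, size=n)

def build_training_set(seed=0):
    rng = np.random.default_rng(seed)
    t = (np.arange(WIN) - WIN // 2) / FS     # time axis centred on the window
    noisy, noise = [], []
    for wavelet in (ricker, zero_phase, mixed_phase):
        for f0 in FREQS:
            clean = wavelet(t, f0)
            for _ in range(TRACES):
                n = placeholder_noise(rng, WIN)
                noisy.append(clean + n)      # class 1: noisy signal
                noise.append(n)              # class 2: noise at the same position
    return np.array(noisy), np.array(noise)

X_signal, X_noise = build_training_set()
print(X_signal.shape, X_noise.shape)
```

Each array holds 3 × 11 × 50 = 1650 rows, matching the sample counts stated above.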

Desert seismic random noise reduction

Let \(X = \{ x_{ij} \}\) be a desert seismic record, where \(i = 1 \cdots N\) indexes the data points and \(j = 1 \cdots D\) indexes the traces. The first part is effective signal detection by LDA. The projection direction \({\varvec{\upomega}}\) is obtained from the training set. It is then treated as a sliding window that moves from top to bottom along each trace, reducing the dimension of the new desert seismic data and yielding the low-dimensional projected data points. These points are divided into two classes by the K-means clustering algorithm with \(k = 2\) (Hartigan and Wong 1979), and the class corresponding to the effective signal is retained as the detection result. The second part is filtering: the filter is applied to obtain the denoising result. The advantage of the sliding-window approach to dimension reduction is that it preserves the relationship between neighboring data points, so that the data in the low-dimensional space retain their original relationship, which in turn helps ensure the accuracy of clustering. The computational cost of the method is dominated by training the projection direction \({\varvec{\upomega}}\) and by clustering; the additional cost of dimension reduction and of extracting the effective signal data points is negligible. Because the amount of sample data is small, the computation takes only a few minutes.

The noise reduction steps for desert seismic data based on LDA effective signal detection are as follows:

Let \(X = \left\{ {x_{ij} } \right\}\) be a desert seismic record, where \(i\) indexes the data points and \(j\) indexes the traces.

  1. Generate the training set and compute the within-class scatter matrix, the between-class scatter matrix and the mean values of the two classes of sample data.

  2. Obtain the best projection direction \({\varvec{\upomega}}\) from Eq. (6) and treat it as a window with a length of 40 points.

  3. Process the desert seismic record trace by trace using Eq. (7), moving the sliding window downward one point at a time to reduce the dimension. To avoid losing data points during this process, zeros are added to the beginning and end of the original data.

  4. Cluster the low-dimensional data with the K-means clustering algorithm to separate noise points from signal points, and extract the signal data points to achieve effective signal detection.

  5. Apply the non-local mean filter to obtain the denoising result.
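Steps 3 and 4 above can be sketched for a single trace as follows. This is an illustrative sketch, not the code used in this paper: the helper names are hypothetical, the K-means routine is a minimal one-dimensional variant with \(k = 2\) rather than the Hartigan and Wong algorithm, and the projection direction `omega` is a Ricker-shaped placeholder instead of a direction learned by LDA. The non-local mean filter of step 5 is not included.

```python
import numpy as np

def sliding_projection(trace, omega):
    """Project every window of a zero-padded trace onto omega (Eq. 7)."""
    w = len(omega)
    left = w // 2
    padded = np.pad(trace, (left, w - 1 - left))   # zeros at both ends
    # One window per original sample: shape (len(trace), w).
    windows = np.lib.stride_tricks.sliding_window_view(padded, w)
    return windows @ omega

def two_means_1d(y, n_iter=100):
    """Minimal 1-D K-means with k = 2; True marks the cluster with the larger centre."""
    c = np.array([y.min(), y.max()], dtype=float)  # initial centres
    for _ in range(n_iter):
        labels = np.abs(y - c[0]) > np.abs(y - c[1])
        if labels.all() or not labels.any():
            break                                   # one cluster is empty
        new_c = np.array([y[~labels].mean(), y[labels].mean()])
        if np.allclose(new_c, c):
            break                                   # converged
        c = new_c
    return labels

# Toy trace (assumed): weak noise plus one event centred at sample 200.
rng = np.random.default_rng(0)
t = (np.arange(40) - 20) / 500.0
a = (np.pi * 30.0 * t) ** 2
omega = (1 - 2 * a) * np.exp(-a)     # placeholder direction, not a trained one
trace = rng.normal(scale=0.05, size=400)
trace[180:220] += omega

y = sliding_projection(trace, omega)  # one projected value per data point
is_signal = two_means_1d(y)           # two clusters, as in step 4
```

Which of the two clusters corresponds to the effective signal has to be decided separately; in this toy example the cluster with the larger centre contains the event.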

Experiments and results

Synthetic desert seismic record

To test the feasibility of this method, we apply it to a synthetic desert seismic record (Fig. 1) with 50 traces of 1400 data points each and dominant frequencies of 30 Hz and 35 Hz. The signal amplitude is 1 and the sampling frequency is 500 Hz, as shown in Fig. 1a. Synthetic desert seismic noise is added to this record to give an SNR of − 8.0619 dB, as shown in Fig. 1b. The SNR is defined as follows (Meng et al. 2017):

$${\text{SNR}} = 10\log \frac{{\sum\nolimits_{i} {\sum\nolimits_{t} {\left| {s\left( {i,t} \right)} \right|^{2} } } }}{{\sum\nolimits_{i} {\sum\nolimits_{t} {\left| {x\left( {i,t} \right) - s\left( {i,t} \right)} \right|^{2} } } }}$$
(11)

where \(s(i,t)\) is the clean synthetic desert seismic signal, \(x(i,t)\) is the noisy signal, \(i = 1 \cdots N\) is the data point index and \(t = 1 \cdots M\) is the trace number. As a comparative test, we replace the signal detection method of the denoising framework with PCA and keep the non-local mean filter as the filtering part. In addition, fx deconvolution and curvelet thresholding are used as comparison methods. The results are shown in Fig. 2, and Fig. 3 shows the corresponding residuals of the synthetic desert seismic record for the five methods. Figure 2a shows the result of the denoising framework proposed in this paper, Fig. 2b the result of the non-local mean filter, and Fig. 2c the result of the comparative test in which PCA is used to detect the effective signal. The results of fx deconvolution and curvelet thresholding are shown in Fig. 2d, e, respectively. The result of the non-local mean filter is not satisfactory: in Fig. 3b, effective signal remains in the residual and the random noise is hardly reduced, which illustrates the shortcomings of non-local mean filtering for random noise reduction in desert seismic records. The results of fx deconvolution and curvelet thresholding are better but still not the best: the seismic events can be seen clearly, but the amplitude of the effective signals is attenuated, as shown in Fig. 3d, e. When the detection method in the same framework is changed to PCA, more random noise is retained, and the noise part and the effective signal part are separated with a higher error rate, which makes it difficult to achieve the desired denoising effect. In the denoising framework introduced in this paper, the noise part and the effective signal part of every trace can be accurately clustered when LDA is used to detect the effective signal. The output SNR of the above methods is listed in Table 1. Figure 4 shows the frequency–wavenumber (FK) spectra of Figs. 1a, b and 2a–e. Comparing Fig. 4a with Fig. 4c, we can see that the FK spectrum of the denoised record is very similar to that of the clean synthetic desert seismic record, and the denoising effect is obvious. In Fig. 4d, e, g, the noise reduction is not obvious: low-frequency noise is not reduced, and the effective signal is also partially lost. In Fig. 4f, part of the effective signal is not clear. Therefore, the method proposed in this paper has the best denoising effect.
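The SNR of Eq. (11) can be computed with a few lines of NumPy. The sketch below assumes that the logarithm in Eq. (11) is base 10, as is usual for decibels, and adds a hypothetical helper, `scale_noise_to_snr`, showing one way to scale a noise record so that a chosen input SNR is reached; the sinusoidal test signal and Gaussian noise are stand-ins, not the synthetic desert data.

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR of Eq. (11) in dB, with the logarithm taken as base 10 (assumed)."""
    clean = np.asarray(clean, dtype=float)
    noisy = np.asarray(noisy, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum((noisy - clean) ** 2))

def scale_noise_to_snr(clean, noise, target_db):
    """Rescale `noise` so that clean + noise has the requested SNR (hypothetical helper)."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    gain = np.sqrt(np.sum(clean ** 2) / (np.sum(noise ** 2) * 10.0 ** (target_db / 10.0)))
    return gain * noise

# Stand-in data: a sinusoid with 1400 points and white Gaussian noise.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 20.0, 1400))
noise = rng.normal(size=1400)
noisy = clean + scale_noise_to_snr(clean, noise, -8.0619)
print(round(snr_db(clean, noisy), 4))
```

The same `snr_db` function applies unchanged to a two-dimensional record, since the sums in Eq. (11) run over both data points and traces.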

Fig. 1 Synthetic desert seismic record. a Clean synthetic desert seismic record. b Synthetic desert seismic records with SNR = − 8.0619 dB

Fig. 2 Processing results of synthetic desert seismic record. a Proposed method results. b Non-local mean filter results. c Non-local mean filter results after PCA detection. d fx deconvolution results. e Curvelet thresholding results

Fig. 3 Residual comparison of synthetic desert seismic record before and after denoising. a Removed noise using proposed method. b Removed noise using non-local mean filter. c Removed noise using non-local mean filter after PCA detection. d Removed noise using fx deconvolution. e Removed noise using curvelet thresholding

Table 1 Output SNR comparison of selected methods for denoising

Fig. 4 FK spectra of synthetic desert seismic record. a FK spectra of clean synthetic desert seismic record. b FK spectra of synthetic desert seismic records. c FK spectra of proposed method results. d FK spectra of non-local mean filter results. e FK spectra of non-local mean filter results after PCA detection. f FK spectra of fx deconvolution results. g FK spectra of curvelet thresholding results

Figure 5 shows the result of single-trace processing. Figure 5a shows, for the 38th trace, the clean synthetic record, the noisy synthetic record, the K-means clustering result, the signal detection result and the filtering result when LDA is used for signal detection. Figure 5b shows the corresponding results when PCA is used for signal detection. The accuracy of LDA in signal detection is much higher than that of PCA, which supports the choice of LDA as the detection method in this denoising framework. Figure 6 compares the amplitudes of the effective signals. In the regions marked by the blue boxes, the effective signal amplitude is well preserved under the proposed framework and the noise is removed completely (Fig. 6a). With the other four methods, however, the amplitude is attenuated to some extent and a lot of random noise is retained (Fig. 6b–e). Although the troughs of some Ricker waves are incomplete, the effective signal is clearly better preserved after denoising under this framework.

Fig. 5 Clustering result of single-trace processing. a Clean synthetic record, synthetic noisy records, K-means clustering results, signal detection results and filtering results of 38th trace after signal detection using LDA, respectively. b Clean synthetic record, synthetic noisy records, K-means clustering results, signal detection results and filtering results of 38th trace after signal detection using PCA, respectively

Fig. 6 Amplitude comparison of signals of 38th trace. a Proposed method results. b Non-local mean filter results. c Non-local mean filter results after PCA detection. d fx deconvolution results. e Curvelet thresholding results

The description above shows that the result of filtering after detection depends mainly on the accuracy of the clustering performed after dimension reduction. To further demonstrate the superiority of the method, experiments on synthetic seismic records with SNR = 2.5583 dB, − 3.7059 dB, − 6.1335 dB, − 8.2854 dB, − 10.4649 dB and − 13.7081 dB are each repeated 200 times to calculate the clustering accuracy. The results are shown in Table 2. The accuracy of clustering is defined as follows:

$$\eta = \frac{m}{n} \times 100\%$$
(12)

where \(n\) is the number of points in the synthetic desert seismic record and \(m\) is the number of points that are correctly clustered. The clustering accuracy of the two detection methods and of direct K-means clustering decreases as the SNR decreases. The accuracy of detection with the unsupervised PCA is strongly influenced by the SNR: it can reach 96.19% when the SNR is high but drops sharply when the SNR is low. In contrast, the accuracy of detection with the supervised LDA is less affected. Although the accuracy of LDA differs little from that of PCA at high SNR, it is far greater than that of PCA at low SNR. This shows that the LDA detection method has higher noise tolerance and better accuracy. The accuracy of clustering directly, without a detection method, is lower than that obtained with a detection method, so it is difficult to achieve effective signal detection in that way. Therefore, given the low SNR of desert seismic records, it is reasonable and effective to use LDA to detect the seismic events and then use the filter to denoise. We also quantitatively analyze the output SNR of the selected methods for different input SNRs. The result is shown in Fig. 7: the denoising framework proposed in this paper gives the largest improvement in SNR. However, when the detection method in the framework is replaced by PCA, the improvement in output SNR is reduced, and the output SNR depends more strongly on the input SNR than for the other methods. In addition, we analyze the mean square error (MSE) of the different denoising methods; the results are shown in Fig. 8. MSE is defined as follows:

$${\text{MSE}} = \frac{{\sum\nolimits_{i} {\sum\nolimits_{t} {\left[ {s\left( {i,t} \right) - x^{\prime } \left( {i,t} \right)} \right]^{2} } } }}{N \times M}$$
(13)

where \(s\left( {i,t} \right)\) is the clean synthetic desert seismic signal, \(x^{\prime } \left( {i,t} \right)\) is the denoised signal, \(i = 1 \cdots N\) is the data point index and \(t = 1 \cdots M\) is the trace number. The MSE of the method proposed in this paper is the smallest, that is, its output is the most similar to the clean synthetic desert seismic signal. From this point of view as well, the proposed method gives the best denoising effect.
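The two evaluation measures of Eqs. (12) and (13) translate directly into code. In the minimal sketch below, the function names and the representation of cluster labels as boolean arrays (True for effective signal) are assumptions made for illustration.

```python
import numpy as np

def clustering_accuracy(pred_labels, true_labels):
    """Eq. (12): percentage of points assigned to the correct cluster."""
    pred = np.asarray(pred_labels, dtype=bool)
    true = np.asarray(true_labels, dtype=bool)
    return 100.0 * np.mean(pred == true)

def mse(clean, denoised):
    """Eq. (13): mean of the squared error over all data points and traces."""
    clean = np.asarray(clean, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    return np.mean((clean - denoised) ** 2)

# Small hand-checkable examples.
print(clustering_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))   # 3 of 4 points agree
print(mse(np.zeros((2, 3)), np.ones((2, 3))))            # every error equals 1
```

Because K-means assigns cluster indices arbitrarily, the predicted labels should be mapped to "signal" and "noise" before the accuracy is computed.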

Table 2 Calculation result of the clustering accuracy based on LDA, PCA and K-means in different SNR

Fig. 7 Performance comparison of selected method for various input SNR

Fig. 8 MSE comparison of selected method for various input SNR

In general, the framework introduced in this paper is very effective in random noise reduction. It is also reasonable to select LDA for effective signal detection.

Real desert seismic record

A real desert seismic record with a total of 101 traces is used to analyze the practical applicability of this framework, as shown in Fig. 9a. Figure 9b shows the result of the random noise reduction framework introduced in this paper. The resolution and continuity of the seismic events are significantly improved, and the effective signal becomes clearer. The areas where the denoising effect is significantly improved are marked with red boxes. Figure 9c shows the result of denoising with the non-local mean filter. The noise is reduced only slightly, the denoising effect is not obvious, and neither the resolution nor the continuity of the seismic events is noticeably improved. Figure 9d shows the result of first using PCA to detect the effective signal and then using the non-local mean filter to denoise; again, the resolution and continuity of the seismic events are not improved. fx deconvolution achieves acceptable results (Fig. 9e), but they are inferior to those in Fig. 9b in terms of the continuity and clarity of the seismic events. The result of curvelet thresholding is also not ideal, as shown in Fig. 9f. Comparing the areas marked by the red boxes in Fig. 9, the method presented in this paper shows better denoising performance. We also compare the difference before and after denoising for the selected methods (Fig. 10). The removed noise shows that the proposed denoising framework removes the noise thoroughly: in Fig. 10a, there is almost no residual effective signal. For the other methods, the random noise reduction is incomplete and effective signal also remains in the removed noise (Fig. 10b–e).

Fig. 9 Processing results of real desert seismic record. a Real desert seismic records. b Proposed method results. c Non-local mean filter results. d Non-local mean filter results after PCA detection. e fx deconvolution results. f Curvelet thresholding results

Fig. 10 Residual comparison of real desert seismic record before and after denoising. a Removed noise using proposed method. b Removed noise using non-local mean filter. c Removed noise using non-local mean filter after PCA detection. d Removed noise using fx deconvolution. e Removed noise using curvelet thresholding

In summary, this method has clear advantages over the non-local mean filter and curvelet thresholding. It is also better than fx deconvolution, which is the most traditional method of seismic signal denoising. The supervised LDA detection method performs better than the unsupervised PCA detection method, which also shows the advantage of supervised LDA in dimension reduction.

Conclusions

In this paper, we have used an LDA-based effective signal detection method to form a framework for reducing random noise in desert seismic records. By learning the two classes of data in the training set and reducing the dimension, the low-dimensional signal data and noise data can be separated accurately in low-SNR scenarios, and the effective signal data can be extracted accurately. When the same projection direction is applied to new seismic data, the data are likewise divided into effective signal and noise, and the effective signal is extracted well. The filter is then used to denoise, so that the denoising effect is clearly improved. We test the framework on both synthetic and real desert seismic records. Compared with conventional methods, such as using the non-local mean filter directly, fx deconvolution and curvelet thresholding, this method achieves better results. When LDA detection is replaced by PCA, the results become worse. In conclusion, the desert seismic noise reduction method based on LDA effective signal detection can accurately detect the effective signal and ultimately obtain a good denoising effect. Using machine learning to process seismic signals is a new idea. In future work, we will try to find a classification algorithm that reduces desert seismic random noise without relying on filters. We may also look for new features in the transform domain to classify seismic data and ultimately achieve noise removal.