
1 Introduction

Picking seismic first arrivals has always been a basic and important step in seismic data processing. When an effective static correction cannot be obtained from reflected waves, the near-surface velocity and thickness can be extracted from the first arrivals, and the static corrections of the shot points and receiver points can then be derived. Accurate picking of the first-arrival position is therefore of great significance for refraction static correction, VSP interpretation, shallow refraction exploration and tomographic exploration in a work area. At the same time, field seismic acquisition now mainly produces three-dimensional, high-density, large-volume data with high fold. Subsequent processing involves three-dimensional seismic data and large-scale two-dimensional seismic data, and picking and correcting first arrivals consumes a great deal of the researcher’s time, so picking efficiency is low. With large data volumes, tight schedules and limited manpower, it is difficult to meet the timeliness requirements of production. Processing personnel may also inadvertently skip to the wrong phase when picking first arrivals, which degrades the static correction. Automatically and accurately picking first arrivals, reducing the workload of manual picking and correction, and improving the efficiency and accuracy of seismic data processing are therefore particularly important in seismic exploration [1].

With the development of seismic exploration technology, automatic first-arrival picking has also developed and matured. Early first-arrival picking methods were mainly based on the overall characteristics of the seismic record, such as the linear least-squares prediction method and the correlation method. These methods can suppress noise to a certain extent, but because they depend on factors such as the correlation between seismic traces, they pick relatively poorly in areas with complex surface conditions. With the development of mathematical tools and computer technology, methods that use the amplitude, phase and waveform characteristics of the first arrival to identify and pick it have emerged, mainly including: ① methods based on the instantaneous characteristics of seismic waves, such as the wavelet transform modulus maxima method; ② methods based on time-window attributes of seismic waves, such as the curve length method, the amplitude method and the energy ratio method; ③ picking methods based on image edge detection. Seismic data processing personnel can choose among these methods according to the exploration area, the signal-to-noise ratio of the first arrivals and the background noise level. For first arrivals with a high signal-to-noise ratio and weak background noise, the picking accuracy of the above methods can basically meet the requirements of static correction. For complex first arrivals and large data volumes, however, the efficiency and accuracy of these methods can hardly meet the needs of actual seismic data processing. In practice, therefore, first-arrival picking in complex areas still relies mainly on manual picking and correction.
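As an illustration of the time-window attribute family mentioned above, the following is a minimal sketch of an energy ratio picker. It is not the implementation used in this paper: the function name `energy_ratio_pick`, the window length and the synthetic trace are all assumptions chosen for the example.

```python
def energy_ratio_pick(trace, window=5, eps=1e-12):
    """Return the sample index where the energy in the window after the
    sample is largest relative to the energy in the window before it.

    `window` and `eps` are illustrative values, not taken from the paper.
    """
    best_idx, best_ratio = None, -1.0
    for i in range(window, len(trace) - window + 1):
        before = sum(x * x for x in trace[i - window:i])   # energy ahead of the candidate
        after = sum(x * x for x in trace[i:i + window])    # energy behind the candidate
        ratio = after / (before + eps)
        if ratio > best_ratio:
            best_idx, best_ratio = i, ratio
    return best_idx


# Synthetic trace: weak alternating noise, then a stronger arrival starting at sample 20.
trace = [0.01 * (-1) ** k for k in range(20)] + [1.0, -0.8, 0.6, -0.4, 0.2] + [0.05] * 15
pick = energy_ratio_pick(trace)
print(pick)
```

On clean data such as this synthetic trace the ratio peaks at the onset. On noisy records from complex surfaces the peak becomes ambiguous, which is the limitation the paragraph above describes.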

With the rapid development of computer hardware, deep learning based on supervised learning models has shown strong performance in classification and detection tasks in image processing. It has broken through many bottlenecks that traditional methods could not solve and has been applied successfully in many industries [2,3,4,5,6,7,8,9,10]. Transfer learning is especially valuable for first-arrival picking, where the data are similar and training takes a long time. Transfer learning based on deep learning is a current frontier of machine learning: after supervised learning, transfer learning is expected to be the next driver of commercial success in machine learning [11]. Given the characteristics of seismic first arrivals, adopting a transfer learning model can greatly improve work efficiency. With transfer learning, a model trained on big data can be transferred to a new task and fine-tuned, so that the new task also benefits from a model trained on big data. Going a step further, the model can be adaptively updated for the task at hand to achieve better results. The core issue of transfer learning is how similar the target domain is to the source domain. If the similarity is high, transfer learning can be applied; if the two domains have little or no similarity, the effect of transfer learning is impaired. For automatic first-arrival picking, this means that the seismic data from the work area used to build the source-domain model should be similar to the seismic data collected in the target domain.

To address the difficulty of predicting on seismic data with a low signal-to-noise ratio and the long time needed for data processing, model building and prediction, and to construct a high-precision identification model, an artificial intelligence method based on the U-Net convolutional neural network is introduced. Using transfer learning, the factors that influence the accuracy of the deep learning model for picking seismic first arrivals are analyzed in depth.

2 Factors Analysis

Intelligent first-arrival picking is a complex systems task that must take many factors into account, including factors related to the work area and the parameter settings of the supervised and transfer learning models of the convolutional neural network. For complex surface conditions, data analysis and a literature review are used to examine the data, the choice of algorithm, the algorithm parameters and other aspects, to analyze the relationship between the seismic data characteristics of the experimental work area and the intelligent picking effect, and to determine the influencing factors. The factors related to the test area mainly include the design of the observation system for the complex surface, the surface type, the choice of excitation method, excitation conditions and excitation parameters, the investigation and suppression of interference waves, the choice of geophone type, the geophone array pattern, the drilling quality in gravel areas, distance, and first-arrival position. All of these affect the quality of the seismic data. The main ones are four: surface type, source type, distance and first-arrival position.

2.1 Surface Type

Seismic exploration under complex surface conditions is a difficult, high-risk and high-investment form of petroleum seismic exploration. To obtain seismic profiles with a high signal-to-noise ratio and high resolution under such conditions, the static correction problem must be solved first.

Mountain Type.

The first factor is the similarity of complex surface conditions, such as mountains, deserts and loess plateaus. Even surfaces of the same type differ. Although the mountains in Xinjiang and the mountains in Sichuan are both mountains, the surface conditions in Xinjiang are simpler, whereas the surface in Sichuan is more complex, with rivers, factories, towns and roads of various grades, and therefore more interference. These differences ultimately show up in the raw shot gather data. Mountainous terrain is characterized by steep slopes, high altitude and large changes in the dip of the surface rock formations, some of which are nearly vertical, as shown in Fig. 1(a). Most mountainous areas lie in regions of strong tectonic compression and uplift, often with exposed bedrock and strata, discontinuous or even missing refraction interfaces, dipping rock layers of different ages, and dramatic lateral changes. These typical mountain features seriously reduce the coupling between the geophone and the surface; there are often severe side reflections and multiple refracted waves, and a single shot may show multiple first arrivals. Mountains generally have the following common characteristics:

The terrain is steep and undulates violently, with extremely well-developed, criss-crossing gullies, some with nearly vertical walls, which makes field operations extremely difficult. Because the relative height difference often exceeds 200 m and the terrain fluctuates strongly, serious static correction problems arise. As the typical mountain shot record in Fig. 1(b) shows, the first arrivals are distorted by the large drop in surface elevation and are difficult to pick; at long offsets in particular, the first arrival is hard to identify.

Fig. 1. Schematic diagram of mountainous landform and first arrival waves.

The weather in some mountain areas is harsh. Mountains above a certain altitude are even snow-covered all year round, and the climate is cold. The snow cover is very thick, the error of field elevation measurement is large, and acquisition and field operations are very difficult; in some places they cannot be completed by normal construction methods. In addition, the thickness and softness of the snow vary, which causes unavoidable static correction problems.

The type, thickness and velocity of the surface rock vary greatly both vertically and laterally, and there is no good refraction interface. The thickness and velocity of the near-surface low-velocity zone vary greatly in the lateral direction, and in some places rock formations are exposed and there is no low-velocity layer at all. In addition, seismic wave energy is strongly absorbed and attenuated, and the excitation and reception effects are poor.

Loess Plateau Type.

The loess plateau is a special landform produced by a dry inland climate and strong wind. Because loess has not been compacted and is relatively soft, it is easily reshaped by rivers and surface runoff. Its main characteristics are as follows:

Loess is divided into two types: primary and secondary. Primary loess is relatively dense, and some of it forms layers. Secondary loess has been re-deposited by wind and water and is relatively loose and porous. The composition of loess is usually unevenly distributed and is often mixed with gravel layers, red clay and the like. When these alternate, the seismic signal characteristics become very complex, and the energy of the first arrival varies strongly in the lateral direction.

The loess is very thick. At depth, thick loess is not layered but behaves more like a continuum. In the rainy season, however, the action and sorting of rainwater give the loess pronounced layering, and some layered features can be seen, as shown in Fig. 2(a). The loess layer is not structurally homogeneous in space, and structural fissures are developed, which can form a strong source of scattered interference. This seriously interferes with the first arrivals, makes them harder to pick, reduces picking efficiency and accuracy, and affects the calculation of the static correction.

The topography of loess-covered areas is relatively complex, the phreatic surface is deeply buried, and the lateral velocity varies greatly. In very thick loess areas, the depth can reach 300–700 m. If the surface is relatively dry, the velocity of the loess is relatively low, and the velocity below the surface sometimes changes irregularly as the water content increases. The lateral velocity and thickness of the loess layer fluctuate in the near surface. When seismic waves pass through the low-velocity loess layer, the reflection events are distorted, causing serious static correction problems. As shown in Fig. 2(b), because of the drastic changes in surface elevation, in the thickness of the low-velocity zone and in the lateral surface velocity, the amplitude, phase and waveform of the first arrivals vary greatly. They are difficult to pick accurately, which seriously affects the accuracy and effectiveness of the static correction.

Fig. 2. Schematic diagram of loess plateau landform and first arrival waves.

Desert Type.

Deserts form in much the same way as loess plateaus: both result from the action of wind under extremely dry climate conditions, and deserts are generally found in the centers of relatively large dry basins. The various dune forms in a desert are the result of changing winds and their reworking of the sand. The specific characteristics are:

Because the climate is dry and hot all year round, the average thickness of aeolian sand accumulation may exceed 40 m, especially in ravine areas. The average thickness of the low-velocity zone is thin because the surface and shallow sands are not compacted and consist only of very loose particles. Most deserts have a deep phreatic surface, which is relatively undulating and unstable. Because the water supply is small and uneven, deserts seldom have a stable top of the high-velocity layer. This makes velocity inversion of the near-surface model difficult: after inversion, the top interface of the high-velocity layer is unclear and hard to identify accurately, which affects the static correction calculation.

Sand dunes of various shapes are the main surface feature of the desert and the morphological units that make it up, and all of their changes are controlled by strong wind, as shown in Fig. 3(a). Since dunes can be moved and reshaped by the wind at any time, the surface elevation is very difficult to measure accurately, which seriously affects the calculation of field static corrections.

The terrain in the desert is not severely dissected, but local sections sometimes undulate strongly. In addition, because of the wind, the windward slope is relatively hard and dense, while the leeward slope is relatively soft and of low density. Such variations in surface sand velocity are likely to cause short-wavelength static correction problems. Figure 3(b) shows two spreads of shot records from the Xinjiang desert after linear moveout correction. The correction removes the time differences at the receiver points caused by differences in offset, so that the reflected and refracted wave characteristics of the complex desert surface can be seen more clearly on the shot record. On the spread on the left side of the figure, the wave-like fluctuation of the first arrivals caused by the dunes is clearly visible. On the spread on the right side, at some offsets the large dune relief, together with the velocity and thickness of the low-velocity zone, severely distorts the first arrivals and causes serious short-wavelength problems.

Fig. 3. Schematic diagram of desert landform and first arrival waves.

Gobi Type.

The Gobi is generally not high in terrain, with small ground undulations and a large number of gravel blocks, which are generally distributed at the edge of the basin, as shown in Fig. 4(a). The specific features are as follows:

Alluvial fans are mostly distributed along the basin edge, and the surface velocity is anisotropic and unstable. Because of sorting, rocks of different ages are distributed in a staggered pattern. Coupled with frequent tectonic activity, deposition here is characterized by short transport distances and rapid accumulation against a background of relatively large topographic relief.

Wind-eroded mounds are a typical wind-erosion landform. They form mainly because fine-grained sediments exposed at the surface are attacked by strong wind, and their surface is usually covered with loose sandy soil. The Yardang (Yadan) landform is a typical product of differential wind erosion and consists of interbedded dense and loose layers. Surface excitation on such landforms produces many secondary disturbances.

The surface gravel layer in Gobi gravel areas is usually very loose and dry and is subject to strong weathering. Overall, because the gravels are poorly sorted and irregularly distributed, they are only weakly cemented; only unstable layers can form, and some areas have no stratification at all. This seriously hinders the downward propagation of seismic wave energy. At the same time, various interference waves are often generated and the noise energy is relatively enhanced, so that the first arrivals are difficult to pick accurately, which affects the accuracy of the static correction. As shown in Fig. 4(b), on the right side of the linearly corrected Gobi record the first arrivals are hard to identify and pick because of serious interference, and some low-frequency first arrivals appear on the far right. This indicates that the near-surface velocity varies greatly laterally and that a thick low-velocity zone is present. It typically occurs in the gullies connecting the Gobi and the mountains, which belong to the piedmont zone. Static correction problems in such exploration areas are generally serious and difficult to solve.

The accumulated thickness of the gravel layer in a fan gradually decreases from the fan root to the fan end. Under the influence of the basin-edge structure, the alluvial fan bodies overlap one another, and the fan structure is non-uniform and unstratified. The velocity and thickness of the near-surface low-velocity layer therefore change drastically in three dimensions, which makes near-surface modeling more difficult and seriously affects the static correction.

Fig. 4. Schematic diagram of Gobi landform and first arrival waves.

2.2 Seismic Source Type

In transfer learning for first-arrival picking, the source type is one of the important factors to compare. Artificial sources are divided into explosive and non-explosive sources. An explosive source is a chemical substance or mixture; it is considered an ideal source because the seismic wave it excites has good pulse characteristics and high energy and provides first arrivals with a clear phase. However, explosive sources are dangerous and costly to operate. The main problem is that explosives are very difficult to use in areas where shot holes cannot be drilled, in areas with severe water shortage (such as deserts) and in areas with a thick low-velocity zone (such as the loess plateau), and some areas do not permit explosives at all. Non-explosive sources have therefore been developed for seismic exploration, and in recent years they have advanced rapidly at home and abroad. Non-explosive sources are sources that excite seismic waves without explosives, and they can be divided into impulsive sources (such as electric spark sources, weight drops and air guns) and vibroseis. The energy of a single non-explosive source is much smaller than that of an explosive source. It is safer to use, but the propagation distance of the seismic wave is limited, and several sources must be combined, or the excitation stacked several times, to increase the seismic energy. As a result, the peaks, troughs and take-off points of the first arrivals have small amplitudes and are difficult to identify automatically. In transfer learning, therefore, two work areas with similar source types should be chosen wherever possible.

2.3 Distance Between Source Domain and Target Domain

In transfer learning, the spatial location of the source-domain and target-domain work areas is very important. The two should be as close to each other as possible, which can effectively improve the accuracy of the model. Because geological reservoirs are strongly heterogeneous, how useful one work area is as a reference for another depends largely on how similar the two areas are, so spatial location has an important impact on the quality of transfer learning.

2.4 Position of First Arrivals

At present, first arrivals are generally picked at one of three positions: the peak, the trough, or the take-off point. When selecting and building a transfer learning model for areas with the same complex surface type and the same source type, it is also necessary to check whether the picking position of the source-domain samples is the same as that of the target-domain samples. If both sets of samples are picked at the same position, for example both at peaks or both at troughs, the transfer learning model trains better.

3 Machine Learning Models

Building on the supervised learning model, most of the network weights are frozen and only a small part of the weights is modified and updated, so that the required prediction accuracy is reached with the least amount of work and computation.

3.1 Supervised Learning Model

In intelligent picking of seismic first arrivals, the seismic data containing the first arrivals are fed through the U-Net convolutional neural network to obtain a result map similar to the labeled sample, which completes the picking. First-arrival picking is thus treated as an image semantic segmentation task. The U-Net deep learning model is chosen to determine whether each pixel in a sample belongs to the first arrival.

The U-Net model is a fully convolutional neural network consisting of two parts: the first half is the encoder (feature extraction) network and the second half is the decoder (feature fusion) network. (1) The encoder is built from repeated downsampling modules, each consisting of two 3 × 3 convolutional layers (ReLU) followed by a 2 × 2 pooling layer. (2) The decoder is built from repeated modules, each consisting of an upsampling convolutional layer, feature concatenation (the feature map produced by upsampling is concatenated with the corresponding encoder feature map on the left side of the network) and two 3 × 3 convolutional layers (ReLU). (3) Finally, after the last two convolution operations generate a feature map, two convolution kernels of size 1 × 1 are used for classification, and the result is fed to the softmax function to obtain the class with the larger probability; the loss is then computed and back-propagated [7,8,9].

This structure first applies multiple layers of convolution and pooling to the input image, pooling four times in the U-Net encoding stage. The input image is 1024 × 192 pixels. After successive pooling, features of four different sizes are obtained: 512 × 96, 256 × 48, 128 × 24 and 64 × 12. The 64 × 12 feature map is then upsampled (or deconvolved) to obtain a 128 × 24 feature map, which is concatenated along the channel dimension with the earlier 128 × 24 feature map. The concatenated feature map is convolved and upsampled to obtain a 256 × 48 feature map, which is in turn concatenated with the earlier 256 × 48 feature map, convolved and upsampled. This continues until a prediction with the same 1024 × 192 size as the input is obtained. The prediction is compared with the labeled sample to obtain the loss value, back-propagation is performed, and the process is iterated until the trained U-Net model is obtained for inference and prediction.
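The feature-map sizes quoted above follow directly from halving both dimensions at each pooling step. The short sketch below reproduces that arithmetic; the function name `unet_encoder_shapes` is hypothetical, and it assumes every pooling step halves both height and width exactly.

```python
def unet_encoder_shapes(height, width, n_pools=4, factor=2):
    """List the (height, width) of the input and of each pooled feature map."""
    shapes = [(height, width)]
    for _ in range(n_pools):
        height, width = height // factor, width // factor
        shapes.append((height, width))
    return shapes


encoder_shapes = unet_encoder_shapes(1024, 192)
# The decoder upsamples back through the same sizes in reverse order,
# concatenating with the encoder feature map of matching size at each level.
decoder_shapes = list(reversed(encoder_shapes[:-1]))

print(encoder_shapes)
print(decoder_shapes)
```

Because 1024 and 192 are both divisible by 2 four times over, every skip connection joins feature maps of identical size, with no cropping or padding.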

3.2 Transfer Learning Model

Transfer learning starts from a supervised learning model that has already been trained. Most of the network weights are retained; the corresponding network layers are called frozen layers. Only a small part of the network weights, those of the adjustable layers, is modified. In this way a new transferred model that meets the prediction accuracy requirements is obtained quickly.

Fig. 5. Schematic diagram of the U-Net model structure for supervised learning and transfer learning in intelligent first-arrival picking.
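The freezing idea can be illustrated without any deep learning framework. The toy sketch below applies one gradient-descent step to a dictionary of weights and skips the frozen layers. The layer names, the choice of which layers are frozen, the gradients and the learning rate are all assumptions for illustration; the paper does not state which U-Net layers are frozen.

```python
def sgd_step(weights, grads, frozen, lr=0.1):
    """Apply one gradient-descent step, leaving frozen layers untouched."""
    return {
        name: w if name in frozen else [wi - lr * gi for wi, gi in zip(w, grads[name])]
        for name, w in weights.items()
    }


# Hypothetical pretrained weights from the source-domain supervised model.
pretrained = {
    "enc1": [0.5, -0.2],
    "enc2": [0.1, 0.3],
    "dec1": [0.4, 0.0],
    "out": [0.2, -0.1],
}
grads = {name: [1.0, 1.0] for name in pretrained}   # dummy gradients from target-domain data
frozen = {"enc1", "enc2", "dec1"}                   # assumed frozen layers; only "out" is adjustable

updated = sgd_step(pretrained, grads, frozen)
print(updated)
```

In a real framework the same effect is usually obtained by marking the frozen layers as non-trainable, so that the optimizer only updates the adjustable layers.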

4 Work Flow

The basic workflow of intelligent picking of seismic first arrivals based on a convolutional neural network is shown in Fig. 6 and mainly includes the following steps:

Comparison Between the Source Domain and the Target Domain.

The source domain and the target domain are compared and analyzed in terms of four aspects (distance, source type, surface type and first-arrival position), and the supervised model with the greatest transfer learning value is selected for modification and training;

Data Collection and Preprocessing.

Collect the seismic shot gather SEGY data, analyze and segment the data, and carry out preprocessing such as missing-value handling and normalization;

Build Supervised Learning Sample Library.

Process the first arrival data to form sample data;

Build Supervised Learning Model.

Build a supervised learning neural network model in terms of the size of the convolution kernel and the number of network layers;

Supervised Learning Model Training.

Train the supervised learning model on the training set, using the seismic data and the corresponding label data as input;

Supervised Learning Model Tuning and Evaluation.

Model tuning is the process of adjusting the algorithm parameters; the model is applied to the validation set and its prediction performance is evaluated;

Transfer Learning.

Apply the trained supervised learning model, establish a transfer learning model, train and tune the transfer learning model, and obtain a usable transfer learning model;

Model Application.

Use the trained model to predict first arrivals on the data to be predicted in the work area.

The application process for automatic first-arrival picking is shown in Fig. 6.

Fig. 6. Technical flow chart of supervised learning and transfer learning.

5 Calculation Process

For the selected experimental work area, the first-arrival picking data were loaded and a supervised learning first-arrival sample library suited to the work area was constructed. The algorithm parameters were configured, the model was trained and its parameters optimized, and the supervised learning model was established. The prediction results were evaluated using the accuracy, recall, error and other statistics computed on the model test data together with the confusion matrix, and intelligent first-arrival picking was completed for the 3D seismic data of the work area.

5.1 Data Collection, Preprocessing and Sample Library Construction

Overview of the Work Area, Data Collection and Analysis. The experimental data for work area A come from a field 3D seismic data volume acquired with explosive sources in a domestic work area in 2018. There are 100,300 shots in total, about 840 million traces, and the total data volume is 2.4 TB, stored in multiple SEGY files. The labels are stored at byte 209. The acquisition layout uses 18 arrays with 8640 receiving channels; the trace length is 2500 ms, the sampling interval is 2 ms, the fold is 180, the bin size is 20 m × 20 m, the trace spacing is 40 m, the receiver line spacing is 400 m, the shot point spacing is 40 m, the shot line spacing is 480 m, the maximum crossline offset is 4780 m, the maximum offset is 7470 m, and the aspect ratio is 0.67.

The experimental data for work area C come from a field 3D seismic data volume acquired with explosive sources in a domestic work area in 2019. There are 43,000 shots in total, and the total data volume is 1.1 TB, stored in multiple SEGY files. The acquisition layout uses 18 arrays with 6480 receiving channels; the trace length is 2500 ms, the sampling interval is 2 ms, the fold is 180, the bin size is 20 m × 20 m, the trace spacing is 40 m, the receiver line spacing is 280 m, the shot point spacing is 40 m, the shot line spacing is 360 m, the maximum crossline offset is 3340 m, the maximum offset is 6332.5 m, and the aspect ratio is 0.62.

Analysis of the raw data shows that the altitude of both work areas is between 320 m and 700 m. Because of the poor surface conditions, the signal-to-noise ratio of the raw data is low. The local elevation changes in the northwestern part of the work areas are large, the thickness and velocity of the near-surface low-velocity zone vary greatly both vertically and laterally, and the static correction problem is serious (Table 1).

Data Preprocessing and Sample Library Construction.

Data Set Cutting and Data Missing Value Processing. The SEGY file is read and traces without a first-arrival time are deleted. Using the amplitude data, the SEGY seismic volume is cut sequentially into samples 192 pixels wide and 1024 pixels high, and label data are created from the existing first-arrival times. If fewer than 192 traces remain when a sample is made, the missing traces are filled with zeros to complete the sample, which is then used for prediction (Fig. 7).

Fig. 7. Schematic diagram of seismic data set cutting.
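The cutting and zero-filling step can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function name `cut_gather` is hypothetical, the gather is represented as a plain list of traces, and it assumes that each trace is truncated to its first 1024 samples (the text does not say how 2500 ms traces are reduced to a height of 1024).

```python
def cut_gather(traces, width=192, height=1024):
    """Cut a list of traces into samples of `width` traces by `height` time samples.

    Short traces and incomplete final blocks are padded with zeros.
    """
    samples = []
    for start in range(0, len(traces), width):
        block = [
            list(t[:height]) + [0.0] * max(0, height - len(t))
            for t in traces[start:start + width]
        ]
        # Trace complement: fill the missing traces of the last block with zeros.
        block += [[0.0] * height for _ in range(width - len(block))]
        samples.append(block)
    return samples


# Dummy gather: 400 traces of 1250 samples each (2500 ms at a 2 ms interval, assumed).
gather = [[1.0] * 1250 for _ in range(400)]
samples = cut_gather(gather)
print(len(samples), len(samples[0]), len(samples[0][0]))
```

With 400 traces, the third sample holds only 16 real traces, and the remaining 176 columns are zero-filled.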

Sample Labeling Method.

Three labeling methods were tried: the first labels only the first-arrival position, the second labels the three neighboring samples above and below the first-arrival position, and the third labels the region below the first-arrival position. The three methods were tested separately, and the third, labeling below the first-arrival position, was finally selected.
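The three labeling schemes can be compared on a single trace with the sketch below. The function name `make_label_column` and the mode names are hypothetical. It assumes a 2 ms sampling interval, that "three neighborhoods" means three samples on each side, and that the "below" scheme includes the first-arrival sample itself; none of these details is confirmed by the text.

```python
def make_label_column(first_arrival_ms, sample_interval_ms=2.0, height=1024, mode="below"):
    """Build a 0/1 label column for one trace under one of three labeling schemes."""
    idx = int(round(first_arrival_ms / sample_interval_ms))
    idx = min(max(idx, 0), height - 1)   # clamp the pick into the sample window
    if mode == "point":                  # label only the first-arrival sample
        return [1 if i == idx else 0 for i in range(height)]
    if mode == "neighborhood":           # label three samples above and below the pick
        return [1 if abs(i - idx) <= 3 else 0 for i in range(height)]
    if mode == "below":                  # label everything from the pick downward
        return [1 if i >= idx else 0 for i in range(height)]
    raise ValueError(f"unknown labeling mode: {mode}")


below = make_label_column(100.0)                   # first arrival at 100 ms -> sample 50
point = make_label_column(100.0, mode="point")
near = make_label_column(100.0, mode="neighborhood")
print(sum(below), sum(point), sum(near))
```

The "below" scheme marks far more positive pixels per trace than the other two, which turns the pick into the boundary between two large regions.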

Proportional Division of Training Sample Set, Verification Sample Set and Inference Set Data.

A total of 100,300 shots were collected in the experimental work area, of which 743 shots were used to build the training data, accounting for 0.74% of the work area volume. A total of 2,000 samples were produced, of which 1,600 are used for the training set and 400 for the validation set. The remaining data in all SEGY files are used for the test set.
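The proportions above can be checked with a few lines of code. The sketch below also shows one possible way to split the 2,000 samples; the random shuffle, the seed and the function name `split_samples` are assumptions, since the text does not say how the samples were assigned.

```python
import random


def split_samples(sample_ids, n_train=1600, seed=0):
    """Shuffle the sample ids reproducibly and split them into training and validation sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


train_ids, val_ids = split_samples(range(2000))
train_fraction = 743 / 100300        # share of shots used to build the samples

print(len(train_ids), len(val_ids))
print(f"{100 * train_fraction:.2f}% of shots")
```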

5.2 Building Supervised Learning Model

This experiment uses a multi-layer deep learning neural network with the U-Net structure as the basic model, and the calculation parameters of the model are initialized as follows:

Initial Setting of Convolutional Layer Parameters.

Every convolutional layer in the network initially uses a uniform 7 × 7 convolution kernel. Each convolutional layer is followed by an activation function: the last convolutional layer uses the sigmoid function, and the remaining convolutional layers use the ELU activation function;

Initialization Settings of Pooling Layer Parameters.

A pooling layer follows the convolutional layers. The pooling kernel of the first layer is 2 × 3, and the pooling kernels of the other layers are 2 × 2. Max pooling is used with a stride of 2;

Initialization Settings of Output Layer Parameters.

A 1 × 1 convolution kernel is used for deep supervision. The output is the classification result for each sampling point of the seismic record amplitudes, and the loss function is the Lovász function;

Other Parameter Initialization Settings.

The weights are initialized with small random numbers and the biases with 0. The batch size is set to 8, the learning rate to 0.0001, and the number of training iterations to 100.
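The text names the Lovász function as the loss but does not say which variant is used. As a hedged illustration, the sketch below implements the binary Lovász hinge (Berman et al.) in NumPy. It operates on pre-sigmoid scores (logits), which is an assumption about how the loss is attached to the network, and it computes the loss value only, without the gradients a training framework would provide.

```python
import numpy as np


def lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss.

    gt_sorted holds the 0/1 labels ordered by decreasing error.
    """
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)
    union = gts + np.cumsum(1.0 - gt_sorted)
    jaccard = 1.0 - intersection / union
    # Turn cumulative Jaccard losses into per-element increments.
    jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard


def lovasz_hinge(logits, labels):
    """Binary Lovasz hinge loss for one sample (flattened internally)."""
    logits = np.asarray(logits, dtype=float).ravel()
    labels = np.asarray(labels, dtype=float).ravel()
    signs = 2.0 * labels - 1.0            # map {0, 1} to {-1, +1}
    errors = 1.0 - logits * signs         # hinge errors
    order = np.argsort(-errors)           # sort by decreasing error
    grad = lovasz_grad(labels[order])
    return float(np.dot(np.maximum(errors[order], 0.0), grad))


good = lovasz_hinge([10.0, -10.0, 10.0], [1, 0, 1])   # confident and correct
bad = lovasz_hinge([-10.0, 10.0, -10.0], [1, 0, 1])   # confident and wrong
print(good, bad)
```

Because this loss is a surrogate for the Jaccard index, it optimizes a quantity close to the intersection-over-union measure used for evaluation, which is one plausible reason for choosing it over a per-pixel loss.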

5.3 Supervised Learning Model Evaluation

This paper uses the Mean Intersection over Union (MIoU) as the model evaluation index. MIoU measures how accurately the corresponding objects are detected in a specific dataset. It is the overlap ratio between the original labeled bounding box and the predicted bounding box generated by the U-Net model, that is, the ratio of their intersection to their union. The ideal case is complete overlap, i.e. a ratio of 1 (Fig. 8).

$$\mathrm{MIoU}=\frac{1}{k}\sum_{i=1}^{k}\frac{\left|P_{i}\cap T_{i}\right|}{\left|P_{i}\cup T_{i}\right|}$$
(1)

where k is the total number of samples; i is the index of the ith sample; P_i is the set of model predicted values for sample i; and T_i is the set of real values for sample i. The relationship between the P and T sets is shown in Fig. 8.

Fig. 8. Schematic diagram of the relationship between the model predicted value set and the actual value set.
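Eq. (1) can be evaluated directly on binary masks. The sketch below is one possible implementation, not the authors' code: the 0.5 threshold on predicted probabilities and the convention that two empty masks count as a perfect match are assumptions.

```python
import numpy as np


def mean_iou(pred_masks, true_masks, threshold=0.5):
    """Average the per-sample intersection over union, as in Eq. (1)."""
    ious = []
    for p, t in zip(pred_masks, true_masks):
        p = np.asarray(p) >= threshold      # binarize predicted probabilities
        t = np.asarray(t).astype(bool)
        union = np.logical_or(p, t).sum()
        inter = np.logical_and(p, t).sum()
        # Two empty masks are treated as complete overlap (assumption).
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))


pred = [np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])]
true = [np.array([[1, 0], [0, 0]]), np.array([[1, 0], [0, 0]])]
print(mean_iou(pred, true))  # (1/2 + 1) / 2 = 0.75
```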

During training, the model also outputs the loss curves of the training set and the validation set. For the current samples, whether the model meets the requirements can be judged from the difference between these two loss curves. Two problems commonly occur: model overfitting and model underfitting.

Model Overfitting.

The trained model performs well on the training set but poorly on the validation set. In this case the loss on the training set is usually small, but it differs considerably from the loss on the validation or test set.

Model Underfitting.

The trained model performs poorly on the training set and also performs poorly on the test set.

During model tuning, the loss and MIoU curves of the training sample set and the validation sample set are compared continuously, and the parameters are adjusted accordingly until the optimal neural network model is obtained.
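The two failure modes can be turned into a simple rule of thumb on the final loss values. The sketch below is purely illustrative: the function name `diagnose_fit` and both thresholds are hypothetical and would need to be chosen for the loss function and data at hand.

```python
def diagnose_fit(train_loss, val_loss, high_loss=0.5, gap=0.2):
    """Classify a training run from the last value of each loss curve.

    high_loss and gap are hypothetical thresholds, not values from the paper.
    """
    final_train, final_val = train_loss[-1], val_loss[-1]
    if final_train > high_loss and final_val > high_loss:
        return "underfitting"   # poor on both sets
    if final_val - final_train > gap:
        return "overfitting"    # good on training, much worse on validation
    return "ok"


print(diagnose_fit([1.0, 0.4, 0.1], [1.0, 0.7, 0.6]))    # overfitting
print(diagnose_fit([1.0, 0.95, 0.9], [1.0, 0.97, 0.95])) # underfitting
print(diagnose_fit([1.0, 0.4, 0.1], [1.0, 0.45, 0.15]))  # ok
```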

5.4 Supervised Learning Model Training and Tuning

Parameter tuning means fitting the model to the data and adjusting its key parameters according to the error calculated on the training set and test set. The U-Net model contains multiple hyperparameters, such as the convolution kernel size, the number of network layers, and the learning rate, and different parameters and parameter combinations affect the accuracy of the model. Based on the sample data of the test work area, this paper tunes more than ten items of the U-Net model: changing the convolution kernel size, adjusting the number of network layers, changing the learning rate, adding residual blocks, changing the number of training samples, changing the loss function, adjusting the optimizer, and adding dropout and an L2 regularization mechanism, among others. The model is tuned according to the training loss curve and the MIoU curve.

Based on the above tuning process, the influence of each hyperparameter on the model accuracy was simulated and calculated. The final optimal hyperparameter combination for the U-Net convolutional neural network is: a convolution kernel size of 3 × 3, a learning rate of 0.0001, 4 layers, a batch size of 4, the Adam optimizer, the ELU activation function (with sigmoid in the last layer), the Lovász loss function, 100 iterations, a dropout rate of 0.5, a limit of 5 on the number of validation-set iterations, and a training set enlarged to 2,000 samples. This batch of samples is extracted uniformly from the overall seismic data volume, with an increased number of low signal-to-noise ratio samples. As Fig. 9 shows, the accuracy is better after the low signal-to-noise ratio samples are added.

Fig. 9. Comparison of results before and after increasing the number of samples with low signal-to-noise ratio.

To evaluate the generalization of the model trained in the experimental work area, the model was used to predict the validation set. The results are shown in Fig. 10, and the prediction accuracy reached 92.00%. The validation results show that the model has learned to extract the characteristics of the seismic first arrival from the training set and can predict the first-arrival position well, which makes it suitable for large-scale generalized application.

Fig. 10. First-arrival picking prediction results on the validation set.

5.5 Transfer Learning Model Training and Tuning

Transfer learning consists of two parts: transferring the source-domain model, and then freezing some layers of that model and retraining it with new data. The first few layers of the pre-trained model capture shallow features such as textures, edges, and contours. Specific downsampling layers of the pre-trained model are therefore frozen, the other layers are fine-tuned, and model training and parameter optimization are carried out as follows:

  (1) Establish a sample database: 600 samples are used for the training set and 100 samples for the validation set. The samples from all remaining SEGY files are used for testing;

  (2) Set the number of frozen layers to 13, that is, reuse the shallow features extracted by the first 13 layers of the supervised U-Net model and train the deep features of the layers after the 13th;

  (3) Set the optimizer to Adam;

  (4) Set the learning rate to 0.0001;

  (5) Set the loss function to the Lovász function;

  (6) Set the number of iterations to 100.
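The freezing step in the settings above can be sketched without committing to a particular deep learning framework. The `Layer` class, the layer names, the total depth of 20 layers, and the configuration dictionary are hypothetical stand-ins; in a real framework the same idea is usually expressed by switching off gradient updates for the chosen layers.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical stand-in for one layer of the pre-trained U-Net."""
    name: str
    trainable: bool = True


def freeze_first_layers(layers, n_frozen):
    """Freeze the first n_frozen layers and leave the rest trainable."""
    for index, layer in enumerate(layers):
        layer.trainable = index >= n_frozen
    return layers


# Initial transfer-learning settings listed in the text.
config = {
    "n_frozen": 13,
    "optimizer": "adam",
    "learning_rate": 1e-4,
    "loss": "lovasz",
    "iterations": 100,
}

unet_layers = [Layer(f"layer_{i}") for i in range(20)]  # depth is an assumption
freeze_first_layers(unet_layers, config["n_frozen"])
trainable = [layer.name for layer in unet_layers if layer.trainable]
print(len(trainable), "layers remain trainable, starting at", trainable[0])
```

Freezing the shallow layers keeps the generic edge and texture features learned in the source work area and limits retraining to the deeper layers, which is why a few hundred samples from the new work area can be enough.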

In the tuning experiments based on these settings, the state of model training was judged from the loss curve until training succeeded. The final preferred hyperparameter combination for the U-Net transfer learning network is: 10 frozen layers, a learning rate of 0.0001, a batch size of 16, the SGD optimizer, the ELU activation function (with sigmoid in the last layer), the MSE loss function, 50 iterations, and a training set enlarged to 900 samples (extracted evenly from the overall seismic data volume, with an increased number of low signal-to-noise ratio samples) (Fig. 11).

Fig. 11. First-arrival picking prediction results on the transfer learning validation set in the C work area.

6 Forecast Results and Analysis Discussion

Using the supervised model trained on the samples of the experimental work area, first-arrival picking was predicted on the test set data. The prediction accuracy for the different SEGY seismic data files is shown in Table 2.

Averaged over these files, the accuracy is 91.61%, which meets the accuracy required for normal production.

Figure 12 shows the first-arrival picking results for part of the predicted data; the red line in the figure is the first arrival picked by the supervised U-Net model. The picks follow the first-arrival points accurately and fit the seismic records closely, which indicates that the method picks first arrivals well even in seismic data with a low signal-to-noise ratio.

Fig. 12. First-arrival picking results for part of the predicted data (the red line is the predicted result).
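The text does not spell out how a predicted mask is converted into a pick or how accuracy is counted. One plausible reading, given that the region below the first arrival is labeled, is to take the first sample in each trace whose predicted probability exceeds a threshold and to count a pick as correct when it lies within a tolerance of the manual pick. The 0.5 threshold, the 3-sample tolerance, and the function names below are assumptions.

```python
import numpy as np


def picks_from_mask(prob, threshold=0.5):
    """Return the first sample index per trace with prob >= threshold.

    prob has shape (n_samples, n_traces); traces with no such sample get -1.
    """
    hit = np.asarray(prob) >= threshold
    first = hit.argmax(axis=0)           # index of the first True in each trace
    first[~hit.any(axis=0)] = -1         # no sample exceeded the threshold
    return first


def pick_accuracy(pred, true, tolerance=3):
    """Fraction of traces picked within `tolerance` samples (hypothetical metric)."""
    pred, true = np.asarray(pred), np.asarray(true)
    return float(np.mean(np.abs(pred - true) <= tolerance))


prob = np.array([[0.1, 0.2, 0.1],
                 [0.9, 0.3, 0.2],
                 [0.8, 0.7, 0.3]])
print(picks_from_mask(prob))                            # [ 1  2 -1]
print(pick_accuracy([10, 20, 30], [11, 20, 40]))        # 2 of 3 within tolerance
```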

The method in this paper is compared with the traditional method in terms of time consumption, and the results are shown in Table 2. The timing data in the table were obtained from multiple experimental runs during the research. The comparison shows that, under the current experimental hardware conditions, the method is effective in both model training and model inference. Table 2 also shows that the precision of the transfer learning model is 90%. Although the C work area is 20 km away from the A work area, the picked first-arrival position is the same in both, namely the take-off point, so the resulting precision is good.

Table 1. Comparison between different work areas.
Table 2. Time-consuming comparison table between the method in this paper and the traditional interaction method.

7 Conclusions

For the selected experimental work area, the first-arrival picking data were loaded, and first-arrival sample libraries suitable for supervised learning and transfer learning in that area were constructed. Supervised learning and transfer learning models were developed, and their predictions were evaluated using the accuracy, error, and other statistics of the test set predictions together with the confusion matrix. Intelligent first-arrival picking by supervised learning and transfer learning was completed for the 3D seismic data in the work area. The conclusions are as follows:

  (1) To address the difficulty of first-arrival picking caused by complex surface types in the work area, diverse sample data sources, the poor signal-to-noise ratio of some seismic data, and the large amount of data sorting, U-Net deep learning models for supervised learning and transfer learning were established, forming a set of techniques for data preprocessing, model construction, model training and inference. The experiments in the actual work area show that the method can greatly improve prediction efficiency while maintaining prediction accuracy, and that it is suitable for promotion and application in oilfields.

  (2) An overall technical strategy for the work area has been formed to address the different data characteristics of different surfaces, the difficulty of building high-precision models, the difficulty of predicting data with low signal-to-noise ratios, and the long time needed for data sorting, model building and prediction. Samples should be extracted evenly from the data volume of the whole work area, should cover as many complex surface data features as possible, and should include enough samples with low signal-to-noise ratio. Multi-device, multi-task, multi-process parallel processing should be used so that first arrivals in low signal-to-noise ratio data can be picked accurately and efficiently.

  (3) Through the experiment in a domestic industrial area, a body of experience in hyperparameter adjustment has been accumulated, covering the convolution kernel size, learning rate, number of network layers, batch size, optimizer, activation function, loss function, dropout mechanism, the limit on the number of validation-set iterations, the number of training set samples, and so on.

In summary, the method in this paper makes full use of the respective technical strengths of deep learning and seismic data processing. It is well suited to intelligent first-arrival picking on massive data and provides technical support for the large-scale use of first arrivals in calculating static corrections.