1 Foreword

1.1 Research Background and Significance

With the rapid development of modern science and technology and the significant improvement of the national economy, the safety of the high-speed railway perimeter has received unprecedented attention. Identifying intrusion targets at the high-speed railway perimeter in a timely and accurate manner has become an important part of modern traffic security work.

However, because the track space of high-speed railways is large and the protective infrastructure around the track is imperfect, intrusion target detection devices cannot be deployed on a large scale. As a result, serious traffic accidents are caused by vehicles, non-motor vehicles, pedestrians and other foreign objects intruding around the railway. Fig. 1 shows objects or people intruding around the track, which affects the safety of high-speed train operation.

Fig. 1. People intrude into the railway track

It can be seen that current high-speed railway perimeter intrusion target recognition technology has shortcomings, and research on track perimeter intrusion recognition is becoming increasingly difficult. The technology urgently needs to be improved in order to reduce casualties and economic losses and to ensure the safety and order of railway operation.

1.2 Research Status at Home and Abroad

1.2.1 Research Status Abroad

Houssam Salmane et al. worked on intelligent video monitoring of level crossings. By detecting, separating and tracking moving objects at crossings, they establish the trajectories of the moving objects, recognize dangerous situations involving the detected objects, estimate the risk level of each target, and thereby predict and evaluate various traffic accidents [1].

Alin Achim proposed a new combined detection method based on pixel statistics, which combines a Gaussian mixture model with block-based detection and can effectively detect moving objects in the monitored area [2].

TB Nguyen proposed a real-time gait detection method for monitoring based on embedded technology. It uses histograms and selective search, and limits the candidate areas with the background difference method and a Gaussian mixture model [3].

Ciresm D proposed a DNN-based target detection algorithm implemented on a parameterized CPU. It can quickly recognize traffic signals without an additional feature extractor [4].

1.2.2 Domestic Research Status

Liu Yang et al. of the Third Railway Exploration Institute proposed a non-contact target detection method that uses a dual grid power system and determines from sensor signals whether foreign objects have intruded around the railway. However, it is prone to false alarms under the influence of external environmental factors [5].

Li Feng of the University of Science and Technology of China proposed a method for detecting moving pedestrians in surveillance video. Using a Gaussian mixture model and oriented gradient features, moving objects in the video image are segmented, and the detected moving objects are classified as human or non-human.

Qu Jianming of Xi’an University of Science and Technology proposed an algorithm for detecting pedestrians and moving vehicles on the road. It combines an AdaBoost classifier with directional weighting to classify moving targets in the pixel area of the image, and can recognize the posture of pedestrians on the road as well as track and detect moving vehicles [6].

1.2.3 Analysis of Research Status

At present, most railway tracks in China use contact-based target detection technology. However, the equipment is expensive, cannot withstand interference from external environmental factors, and cannot cover the railway perimeter on a large scale. In addition, existing studies only analyze the performance of deep learning methods for perimeter intrusion target detection on still pictures. They do not consider the functional requirements of practical high-speed railway perimeter intrusion target recognition, and a recognition system that integrates the algorithm, has clearly defined functions, and can be put into practical use is still lacking.

1.3 Main Contents of This Article

This paper uses a deep learning based target recognition method to detect intruders at the high-speed railway perimeter. A deep learning model is selected and trained to locate and classify intrusion targets accurately, and a deep learning based high-speed railway perimeter intrusion recognition system is built on top of it. The system is intended for practical use, to improve the safety of high-speed railway operation and to help guarantee normal railway operation. The experimental environment is an area around the railway with good visibility, and a suitable deep learning network is selected and trained to identify perimeter intrusion targets accurately.

2 Target Detection Algorithm Based on Deep Learning

2.1 Single-Stage Target Detection Network

Typical algorithms based on deep neural networks are the YOLO (You Only Look Once) series, which are mainly used for target recognition and localization. They run fast and can be applied to target detection in real-time systems. A YOLO network mainly consists of three parts. The first part is the Backbone, a convolutional neural network that aggregates image information into image features. The second part is the Neck, a group of layers that mixes and combines image features and passes them to the prediction layer. The third part is the Head, which makes predictions on the image features, generates bounding boxes, and predicts the target class [7].

YOLO is a typical single-stage target detection algorithm. Its detection speed is higher than that of two-stage detection algorithms, and YOLO V4 largely makes up for the accuracy shortfall typical of single-stage algorithms. The experimental data show that YOLO V4 improves significantly in both speed and accuracy: on the COCO dataset, compared with YOLO V3, its average precision (AP) increases by 10% and its frame rate (FPS) increases by 12%. YOLO V4 builds on the YOLO V3 network and incorporates the most effective deep learning target detection techniques of recent years [8]. It performs well in the field of target detection, with a detection speed of 65 FPS. The enhanced feature extraction network in YOLO V4 upgrades the Feature Pyramid Network (FPN) adopted by YOLO V3 to PANet and combines it with Spatial Pyramid Pooling (SPP). The prediction network still adopts the YOLO Head of YOLO V3, while DIoU-NMS is used to filter the prediction boxes. The resulting model framework is “CSPDarknet53 + SPP_PANet + YOLO Head”. Figure 2 is the schematic diagram of the YOLO V4 algorithm [9].

Fig. 2. Network architecture diagram of YOLO V4
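The box filtering step mentioned above can be illustrated with a minimal sketch of DIoU-based non-maximum suppression. It is not the implementation used in this paper: the function names `diou` and `diou_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for illustration. The sketch uses the commonly cited definition, IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def diou(a: Box, b: Box) -> float:
    """Distance-IoU of two boxes: IoU - (center distance)^2 / (enclosing diagonal)^2."""
    # Intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes: List[Box], scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep
```

Because the center-distance penalty lowers the score of boxes whose centers are far apart, two overlapping boxes that belong to different nearby targets are less likely to suppress each other than with plain IoU.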

2.2 Two-Stage Target Detection Network

The two-stage target detection algorithm follows the same process as traditional target detection algorithms: first obtain the target candidate boxes, then extract features from the candidate boxes, and finally generate the detection results. The commonly used two-stage target detection networks are R-CNN, Fast R-CNN, and Faster R-CNN, which builds on them.

Faster R-CNN adds a neural network that proposes candidate boxes. The algorithm consists of candidate box extraction by a Region Proposal Network (RPN) and detection by Fast R-CNN. The RPN is a fully convolutional neural network. Faster R-CNN takes the candidate boxes produced by the RPN and performs target recognition within them [10]. In this way, the four basic steps of target recognition (generating candidate regions, extracting features, classifying, and refining positions) are integrated into a single deep network architecture.

To sum up, Faster R-CNN, a typical two-stage target detection algorithm, divides the target detection process into two stages [11]. The first stage generates candidate regions containing the approximate location of the target, and the second stage refines the classification and the specific location of the target within each region. The recognition error rate of this method is low, but it is very slow, which makes it difficult to achieve the real-time detection required for high-speed railway perimeter intrusion target recognition. The prediction speed is stable at 0.7 fps. If the parameters are adjusted and the number of candidate boxes is reduced, the accuracy drops by about 4% and the speed reaches about 2 fps, which is still very slow for real-time detection [12]. The YOLO V4 algorithm is much faster than the R-CNN family, and it is kept within the range of 2.4 fps. In addition, the feature extraction layers of YOLO V4 adopt a feature pyramid structure with down-sampling, and Mosaic is used for data augmentation during training, so it also achieves good results in small target detection.

This paper mainly detects intrusion targets around the high-speed railway [13]. Existing intelligent video analysis systems are suitable for recognizing intrusion targets within 0–100 m. However, for intrusion targets at 100–200 m, namely “far and small” targets, problems such as insufficient detection accuracy, false positives and missed detections remain.

3 Experiment Data Set

3.1 Definition of Small Target

In the environment around a high-speed railway, identifying small targets is a very important task that plays a vital role in ensuring the safety and order of train operation. The standard COCO dataset divides targets by pixel area. Its definition of small targets is shown in Table 1.

Table 1. Definition of small target in standard COCO dataset
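For reference, the size buckets commonly used in COCO evaluation can be written as a small helper. This is a sketch, not part of the original table: COCO measures the area of the object mask, whereas the hypothetical function `coco_size_category` below assumes the bounding-box area is an acceptable stand-in, and the treatment of values that fall exactly on a boundary is a choice made here.

```python
SMALL_MAX_AREA = 32 ** 2    # below 32 x 32 pixels: small
MEDIUM_MAX_AREA = 96 ** 2   # from 32 x 32 up to 96 x 96 pixels: medium


def coco_size_category(width: float, height: float) -> str:
    """Classify a target as small, medium or large from its pixel area.

    Assumption: bounding-box area approximates the object area.
    """
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"
```

A person seen at long range who occupies, say, a 12 × 30 pixel box falls well inside the small bucket, which is the case this paper targets.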

This paper focuses on detecting “far and small” intrusion targets at the perimeter of the high-speed railway, such as the framed targets in Fig. 3. Compared with targets in ordinary pictures, such targets occupy a smaller pixel area in the image, appear blurred, and are easily overlooked. Identifying them is relatively difficult, and missed detections and false detections occur easily, so the performance requirements for the target detection algorithm are more stringent [14].

Fig. 3. Schematic of far and small invasion targets

3.2 Data Set Construction

3.2.1 Dataset Content

The high-speed railway perimeter intrusion data used in this paper was collected from surveillance video of a high-speed railway perimeter. The filtered video was split into image frames, followed by data augmentation and processing, data classification, and image annotation, which finally yielded the high-speed railway perimeter intrusion dataset.

3.2.2 Dataset Tags

After video framing, filtering and augmentation, the data set contains 420 × 7 = 2940 pictures in total. The training set, validation set and test set are selected in a ratio of 8:1:1. The selection method for each set and the number of pictures selected are as follows:

(1) At equal intervals through the sequence, 294 images are selected from the data set as the validation set, leaving 2646 images;

(2) From the augmented data set, 336 images of each type are selected as the training set, 2352 training images in total, leaving 294 images;

(3) All remaining pictures are used as the test set, 294 pictures in total.
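The three selection steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: it assumes the 2940 images form seven groups of 420 stored group by group, that "equal intervals" means every tenth image, and that the 336 training images per group are drawn at random. The function name `split_dataset` and its parameters are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (group name, file name)


def split_dataset(
    images_by_type: Dict[str, List[str]],
    val_interval: int = 10,     # assumption: every 10th image goes to validation
    train_per_type: int = 336,
    seed: int = 0,
) -> Tuple[List[Item], List[Item], List[Item]]:
    """Return (train, validation, test) lists following steps (1) to (3)."""
    all_images = [(t, name) for t, names in images_by_type.items() for name in names]

    # Step (1): pick validation images at a fixed interval through the sequence
    val = [item for i, item in enumerate(all_images) if i % val_interval == 0]
    remaining = [item for i, item in enumerate(all_images) if i % val_interval != 0]

    # Steps (2) and (3): per group, draw the training images; the rest is the test set
    rng = random.Random(seed)  # assumption: random selection within each group
    train: List[Item] = []
    test: List[Item] = []
    for t in images_by_type:
        pool = [item for item in remaining if item[0] == t]
        rng.shuffle(pool)
        train.extend(pool[:train_per_type])
        test.extend(pool[train_per_type:])
    return train, val, test
```

With seven groups of 420 images this yields 2352 training, 294 validation and 294 test images, matching the 8:1:1 ratio, and no image appears in more than one set.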

The images of the test set and the training set are annotated with the LabelImg software.

4 Analysis of Experimental Results

The results of the YOLO V4 algorithm and Faster R-CNN are compared on the evaluation indicators under the same data set and experimental conditions. The loss value of the YOLO V4 target detection algorithm selected in this paper decreases continuously during training. The loss values were extracted and visualized, giving the loss curve shown in Fig. 4.

Fig. 4. YOLO V4 loss function curve image

As shown in Figs. 5 and 6, when trained on the same data set and hardware, YOLO V4 accurately identifies the two intruders within the high-speed railway perimeter in the picture and labels each with its category, such as “person” and “car”. Faster R-CNN, after training, also identifies the intruders. Because the intruders in this picture are relatively close, Faster R-CNN can likewise mark and classify them accurately.

Fig. 5. YOLO V4 close range recognition

Fig. 6. Faster R-CNN short range recognition

YOLO V4 can accurately identify small targets that are far away. As shown in Fig. 7, Faster R-CNN recognizes small targets poorly and produces false positives, generating two candidate boxes for the same target.

Fig. 7. Faster R-CNN remote recognition

The detection results of the two algorithms are analyzed and compared under a unified set of target recognition evaluation indicators. Fig. 8 shows the recall curves of Faster R-CNN. The two categories, “person” and “car”, are calculated separately, so the average of the two is taken as the recall of the algorithm, and the other indicators are calculated in the same way. The specific data comparing the target recognition performance of the two algorithms are shown in Table 2.

Fig. 8. Recall performance of Faster R-CNN algorithm
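The per-class averaging described above can be sketched as follows. The counts in the usage note are made-up numbers for illustration only and are not results from this paper; the function names `precision_recall` and `macro_average` are hypothetical.

```python
from typing import Dict, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall for one class from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_average(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """Average precision and recall over classes, each class weighted equally.

    `counts` maps a class name to (tp, fp, fn).
    """
    per_class = [precision_recall(*c) for c in counts.values()]
    n = len(per_class)
    mean_precision = sum(p for p, _ in per_class) / n
    mean_recall = sum(r for _, r in per_class) / n
    return mean_precision, mean_recall
```

For example, calling `macro_average({"person": (80, 20, 20), "car": (90, 10, 30)})` averages a recall of 0.80 for “person” with 0.75 for “car”. Weighting the classes equally keeps a class with few samples from being dominated by a class with many.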

Table 2. Comparison of target recognition effects

In conclusion, compared with Faster R-CNN, YOLO V4 has clear advantages in all aspects except its large model size. Faster R-CNN, as a two-stage model, sacrifices the detection speed of intrusion targets to improve detection accuracy. Although its recall is good, its high false detection rate leaves the two out of balance. The YOLO V4 algorithm achieves a good balance between real-time performance, accuracy and recall, and can meet the real-time and accuracy requirements of far and small intrusion target recognition at the high-speed railway perimeter. Therefore, the YOLO V4 algorithm is integrated into the far and small intrusion target recognition system built for the high-speed railway perimeter.

5 Remote and Small Intrusion Target Detection System for High-Speed Railway Perimeter

For image recognition, click the “Select Image” button; a folder dialog opens, in which the image is selected. The selected image appears in the left box. Click the “Start Identification” button to call the trained YOLO V4 algorithm and identify the intrusion. The identification results are shown in Fig. 9. The types of intrusion are displayed under “Target Identification Results” at the bottom right, and the intruding objects are framed in the image and labeled with their intrusion type.

Fig. 9. Display of image target recognition results

For video recognition, first click the “Select Video File” button; a folder dialog opens, in which the video file is selected. The QTimer timer then starts, the background process converts the video file into image frames, and the trained YOLO V4 algorithm is called to identify the intruders in each frame. The identification results are displayed dynamically, as shown in Fig. 10, and the types of intruders are displayed dynamically under “Target Identification Results” at the lower right. The intruding objects are also framed in the video and labeled with their intrusion type. While detection is running, the original “Select Video File” button changes to a “Stop Video Detection” button. Clicking “Stop Video Detection” stops intrusion target identification for the selected video file and restores the initial interface.

Fig. 10. Recognition results of video target recognition
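The timer-driven video workflow described above can be sketched independently of the GUI toolkit. The class below is a hypothetical illustration, not the system's actual code: `VideoDetectionSession`, `on_timer_tick` and the injected `detect` callable are assumed names, and the detector stands in for the trained YOLO V4 model. In a Qt application, `on_timer_tick` would typically be connected to the timer's `timeout` signal.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# (class label, (x1, y1, x2, y2)): assumed detection format
Detection = Tuple[str, Tuple[int, int, int, int]]


class VideoDetectionSession:
    """Processes one video frame per timer tick and tracks the button state."""

    def __init__(self, frames: Iterator[object],
                 detect: Callable[[object], List[Detection]]) -> None:
        self._frames = frames      # frame source, e.g. decoded video frames
        self._detect = detect      # stand-in for the trained detector
        self.running = False
        self.latest: List[Detection] = []

    @property
    def button_label(self) -> str:
        # The same button toggles between selecting a file and stopping detection
        return "Stop Video Detection" if self.running else "Select Video File"

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        # Restore the initial state of the interface
        self.running = False
        self.latest = []

    def on_timer_tick(self) -> Optional[List[Detection]]:
        """Handle one timer tick; returns the detections or None when idle or finished."""
        if not self.running:
            return None
        frame = next(self._frames, None)
        if frame is None:          # end of the video: stop automatically
            self.stop()
            return None
        self.latest = self._detect(frame)
        return self.latest
```

Keeping the frame loop out of the widget code means the detection logic can be exercised with a plain list of frames and a dummy detector, without opening a window.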

For camera recognition, click the “Open Camera” button to connect automatically to the computer's camera. The image captured by the camera is displayed in the left box, and the identification results are displayed dynamically. Because the connected camera is the one attached to the computer, high-speed railway perimeter images cannot be detected in this mode.

6 Summary

This paper proposes a deep learning based video analysis system for high-speed railway perimeter monitoring, which improves both the recognition accuracy for far and small intrusion objects around the perimeter and the real-time performance of target recognition. In the long-distance intrusion target recognition experiments at the high-speed railway perimeter, the parameters of the algorithms were adjusted, and the test data were analyzed and optimized according to unified evaluation criteria to achieve the best target recognition performance. Finally, the target detection algorithm with the best recognition performance was combined with software to build a far and small intrusion target recognition system for the high-speed railway perimeter, and the functional modules of the system were introduced in detail.