1 Introduction

ETC gantry data are cross-sectional traffic flow data that accurately record vehicle types and key traffic characteristics such as speed and flow. Their accuracy exceeds 99%, and their integrity, accuracy and authenticity are better than those of other data sources such as traffic survey data and RTMS (Remote Traffic Microwave Sensor) data. These data can help managers and decision makers predict traffic flow more effectively.

In the field of traffic flow prediction, scholars have proposed parametric models, nonparametric models, hybrid models and machine learning/deep learning models [1]. Neural network models are a subclass of nonparametric models; their widely interconnected structure and effective learning mechanism simulate the way the human brain processes information. Trained on a large amount of historical data, they achieve high prediction accuracy [2,3,4,5].

To improve prediction accuracy and make full use of the information in ETC gantry data, this study uses an RBF neural network model to predict traffic flow from historical data. The time-varying characteristics of the flow are also taken into account to keep the predicted curve accurate. The proposed method has high prediction accuracy and strong robustness, and offers an approach to predicting section traffic flow.

2 ETC Gantry System

In 2019, China vigorously promoted ETC technology on 143,000 km of expressways, cancelled 487 toll stations at provincial expressway boundaries, built 24,588 sets of ETC gantry systems and reconstructed 48,211 ETC lanes. The number of ETC users reached 2.0400 million. As a result, highway infrastructure and management improved substantially.

According to the overall technical requirements of the expressway provincial boundary toll stations, ETC gantry systems should be set up between each interchange and entrance/exit of expressways, so that ETC vehicles and MTC (Manual Toll Collection) vehicles are tolled by segment. For ETC vehicles, the system generates the transaction flow (or pass certificate), the ETC pass record and the captured image information (including license plate number, license plate color, etc.), and uploads them promptly to the provincial settlement center and the ministry network center. For MTC vehicles, the system reads the vehicle information in the CPC card (including license plate number, license plate color, model information, etc.), calculates the fee and writes it into the CPC card, forms the CPC line record, and uploads the captured image information promptly to the provincial settlement center and the ministry network center.

A detailed study of the data content of ETC gantry systems shows that they provide high-precision information such as traffic flow and interval average speed, and can be used to evaluate the real-time traffic state of expressways. Identifying traffic congestion near a gantry, or predicting upcoming congestion, is of great importance.

3 Overview of RBF Neural Network

The RBF (Radial Basis Function) neural network is a typical feedforward neural network. Its distinguishing feature is that a radial basis function serves as the transformation function of the hidden-layer nodes, so that the hidden layer maps low-dimensional input data into a high-dimensional space and turns a problem that is linearly non-separable in the low-dimensional space into one that is linearly separable in the high-dimensional space. An RBF neural network usually consists of an input layer, a hidden layer and an output layer; its structure is shown in Fig. 1 [6, 7].

Fig. 1 Neural network prediction principle

Vector \(X = \left( {x_{t} ,x_{t - 1} , \ldots ,x_{t - n} } \right)\) is the input variable of the network, vector \(Y = \left( {\hat{x}_{t + 1} , \ldots ,\hat{x}_{t + d} } \right)\) is the output variable of the network, and \(W\) is the connection weight matrix between the hidden layer and the output layer.
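The input and output vectors can be assembled from a flow series with a sliding window. The following is a minimal Python sketch rather than the authors' MATLAB implementation; the function name `make_windows` and the sample flow values are hypothetical and serve only as an illustration.

```python
def make_windows(series, n, d):
    """Build (X, Y) pairs from a 1-D flow series.

    X = (x_t, x_{t-1}, ..., x_{t-n})  -> n + 1 lagged values, newest first
    Y = (x_{t+1}, ..., x_{t+d})       -> the next d values
    """
    inputs, targets = [], []
    for t in range(n, len(series) - d):
        inputs.append([series[t - k] for k in range(n + 1)])
        targets.append([series[t + k] for k in range(1, d + 1)])
    return inputs, targets


# Made-up flow values in veh/min, for illustration only.
flows = [10, 12, 15, 14, 18, 21, 19]
X, Y = make_windows(flows, n=2, d=1)
print(X[0], Y[0])  # [15, 12, 10] [14]
```

With n = 2 and d = 1, each input holds the current value and its two predecessors, and each target holds the single following value.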

The prediction process of RBF neural network can be expressed as:

$$w_{j} = \exp \left( { - \frac{{\left\| {x_{t - j} - u_{j} } \right\|^{2} }}{{2\sigma_{j}^{2} }}} \right)$$
(1)
$$\hat{x}_{t + d} = \mathop \sum \limits_{j = 1}^{m} w_{jd} w_{j}$$
(2)

In these formulas, \(w_{j}\) is the output of the jth node in the hidden layer, \(x_{t - j}\) is the traffic flow observation at time \(t - j\), \(\left\| {x_{t - j} - u_{j} } \right\|\) is the norm of the difference between the observation and the center, \(u_{j}\) is the center of the Gaussian function, and \(\sigma_{j}\) is the width of the Gaussian function. n is the number of nodes in the input layer, m is the number of nodes in the hidden layer, d is the number of nodes in the output layer, \(w_{jd}\) is the connection weight between the hidden layer and the output layer, and \(\hat{x}_{t + d}\) is the predicted traffic flow at time \(t + d\).
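The forward pass described by Eqs. (1) and (2) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it treats the network input as a single vector compared with each center using the Euclidean norm, and the function names and toy parameter values are hypothetical.

```python
import math


def hidden_activations(x, centers, widths):
    """Eq. (1): Gaussian response of each hidden node to the input vector x."""
    acts = []
    for u, sigma in zip(centers, widths):
        sq_dist = sum((xi - ui) ** 2 for xi, ui in zip(x, u))
        acts.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return acts


def rbf_predict(x, centers, widths, out_weights):
    """Eq. (2): each output is a weighted sum of the hidden activations.

    out_weights[j][k] plays the role of w_jd (hidden node j -> output k).
    """
    acts = hidden_activations(x, centers, widths)
    n_out = len(out_weights[0])
    return [sum(out_weights[j][k] * acts[j] for j in range(len(acts)))
            for k in range(n_out)]


# Toy parameters, chosen only for illustration.
centers = [[1.0, 0.0], [0.0, 1.0]]
widths = [1.0, 1.0]
out_weights = [[2.0], [3.0]]
prediction = rbf_predict([1.0, 0.0], centers, widths, out_weights)
```

Here the input coincides with the first center, so the first hidden node responds with 1 and the second with exp(-1); the single output is therefore 2 + 3·exp(-1).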

The center vector of the radial basis function is \(u_{j} = \left[ {u_{j1} ,u_{j2} , \ldots ,u_{{j\left( {t - n} \right)}} } \right]^{T}\). The kernel width \(\sigma_{j}\) and the connection weights \(w_{jd}\) between the hidden layer and the output layer are parameters of the RBF neural network. \(u_{j}\) and \(\sigma_{j}\) can be determined by the FCM clustering algorithm using Eqs. (3) and (4), and the parameter \(w_{jd}\) is obtained by the gradient descent learning algorithm.

$$u_{jk} = \mathop \sum \limits_{i = 1}^{n} \mu_{ij} x_{ik} /\mathop \sum \limits_{i = 1}^{n} \mu_{ij}$$
(3)
$$\sigma_{j} = \mathop \sum \limits_{i = 1}^{n} \mu_{ij} \left\| {x_{i} - u_{j} } \right\|^{2} /\mathop \sum \limits_{i = 1}^{n} \mu_{ij}$$
(4)

In these formulas, \(\mu_{ij}\) is the fuzzy membership degree of sample \(x_{i}\) in the jth class, as obtained by the FCM clustering algorithm, and n is the training sample size.
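As a rough illustration of Eqs. (3) and (4), the Python sketch below computes the centers and widths from memberships that are assumed to have been produced already by an FCM run. The function name and the toy data are hypothetical; the memberships are used without a fuzzifier exponent, and the width is taken as the membership-weighted mean squared distance to the center, which is one reading of Eq. (4).

```python
def fcm_centers_and_widths(samples, memberships):
    """Apply Eqs. (3) and (4) to precomputed fuzzy memberships.

    samples[i]        -> i-th training vector x_i
    memberships[i][j] -> mu_ij, membership of x_i in cluster j
    """
    n_clusters = len(memberships[0])
    dim = len(samples[0])
    centers, widths = [], []
    for j in range(n_clusters):
        total = sum(mu[j] for mu in memberships)
        # Eq. (3): membership-weighted mean of the samples.
        center = [
            sum(mu[j] * x[k] for mu, x in zip(memberships, samples)) / total
            for k in range(dim)
        ]
        # Eq. (4): membership-weighted mean squared distance to the center.
        spread = sum(
            mu[j] * sum((xk - ck) ** 2 for xk, ck in zip(x, center))
            for mu, x in zip(memberships, samples)
        ) / total
        centers.append(center)
        widths.append(spread)
    return centers, widths


# Crisp toy memberships (two clusters), for illustration only.
samples = [[0.0], [2.0], [9.0], [11.0]]
memberships = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
centers, widths = fcm_centers_and_widths(samples, memberships)
print(centers, widths)  # [[1.0], [10.0]] [1.0, 1.0]
```

A cluster whose samples all coincide with its center would get a width of zero, which would make Eq. (1) undefined, so a practical implementation would likely need a small lower bound on the width.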

Let \(\widetilde{{x_{j} }} = \varphi \left( {\left\| {x_{t - j} - u_{j} } \right\|} \right),j = 1,2, \ldots ,m\), so

$$\tilde{x} = \left[ {\widetilde{{x_{1} }},\widetilde{{x_{2} }} \ldots ,\widetilde{{x_{m} }}} \right]^{T}$$
(5)

The center \(u_{j}\) and the kernel width \(\sigma_{j}\) of the radial basis function obtained from Eqs. (3) and (4) are substituted into Eq. (1) to realize the nonlinear mapping from the input layer to the hidden layer.

The model is then built as follows:

Step 1: The values of \(u_{j}\) and \(\sigma_{j}\) are obtained from Eqs. (3) and (4), and the model input \(\tilde{x}\) is constructed according to Eq. (5).

Step 2: Introduce the ε-insensitive loss function.

The ε-insensitive loss function \(L^{\varepsilon } \left( {x,y,f} \right)\) is defined as

$$L^{\varepsilon } \left( {x,y,f} \right) = \left| {y - f\left( x \right)} \right|_{\varepsilon } = \max \left( {0,\left| {y - f\left( x \right)} \right| - \varepsilon } \right)$$
(6)

In the formula, \(x \in R^{m}\), \(y \in R\).

For the linear model \(y_{j}^{o} = p^{T} \widetilde{{x_{j} }}\), the corresponding ε-insensitive loss function can be expressed as:

$$\mathop \sum \limits_{j = 1}^{n} \left| {y_{j}^{o} - y_{j} } \right|_{\varepsilon } = \mathop \sum \limits_{j = 1}^{n} \max \left( {0,\left| {y_{j}^{o} - y_{j} } \right| - \varepsilon } \right) = \mathop \sum \limits_{j = 1}^{n} \max \left( {0,\left| {p^{T} \widetilde{{x_{j} }} - y_{j} } \right| - \varepsilon } \right)$$
(7)

In the formula, \(y_{j}^{o}\) is the neural network output and \(y_{j}\) is the observed output.
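The ε-insensitive loss can be illustrated with a short Python sketch. It uses the form max(0, |y − f(x)| − ε) that appears in Eq. (7); the function names and sample numbers are hypothetical.

```python
def eps_insensitive(y_true, y_pred, eps):
    """Per-sample loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def total_eps_loss(targets, outputs, eps):
    """Eq. (7): sum of the per-sample eps-insensitive losses."""
    return sum(eps_insensitive(y, yo, eps) for y, yo in zip(targets, outputs))


# Made-up observed and predicted values, for illustration only.
loss = total_eps_loss([10.0, 12.0, 15.0], [10.5, 14.0, 15.0], eps=1.0)
print(loss)  # 1.0
```

Only the second sample, whose error of 2 exceeds ε = 1, contributes to the loss; errors inside the tube are ignored.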

Step 3: Prediction.

$$y = p^{T} \varphi \left( {\tilde{x}_{test} } \right) = \lambda \mathop \sum \limits_{j = 1}^{n} \left( {\alpha_{j} - \alpha_{j}^{*} } \right)\varphi^{T} \left( {\widetilde{{x_{j} }}} \right)\varphi \left( {\tilde{x}_{test} } \right) = \lambda \mathop \sum \limits_{j = 1}^{n} \left( {\alpha_{j} - \alpha_{j}^{*} } \right)\tilde{K}\left( {\widetilde{{x_{j} }},\tilde{x}_{test} } \right)$$
(8)
$$\begin{gathered} y = \left[ {y_{1} \ldots y_{n} } \right]^{T} ,\alpha = \left[ {\alpha_{1} \ldots \alpha_{n} } \right]^{T} ,\alpha^{*} = \left[ {\alpha_{1}^{*} \ldots \alpha_{n}^{*} } \right]^{T} , \hfill \\ \tilde{K} = \left[ {\tilde{k}\left( {\widetilde{{x_{j} }},\widetilde{{x_{i} }}} \right)} \right] = \left[ {\begin{array}{*{20}c} {K + \frac{\mu n}{\lambda }I} & { - K} \\ { - K} & {K + \frac{\mu n}{\lambda }I} \\ \end{array} } \right] \hfill \\ \end{gathered}$$
(9)

4 Experimental Verification

4.1 Basic Data Description

The data used in this study are the ETC gantry data of Shandong Province. First, the protocol and format of the ETC gantry data were studied, and the data were initialized to determine the parameters of the model. On this basis, the prediction performance of the model was evaluated by root mean square error (RMSE), mean absolute percentage error (MAPE) and mean absolute error (MAE).

Basic data

To verify the validity and correctness of the model, three typical gantries on the Beijing-Shanghai Expressway in Shandong Province were selected as research objects, and their holiday traffic flow was predicted. The three gantry numbers are G000237011000320070, G000237011000420080 and G000237011000510040, all of which are uplink gantries. Historical gantry data for the 8 days from September 29 to October 6, 2020 were used to predict the traffic flow on October 7 and 8.

The ETC gantry data selected in this study include the ETC transaction flow (double-chip OBU), the ETC traffic record (transaction failure), the image flow record and the CPC card record, of which the image flow record is auxiliary data. Taking the ETC transaction flow (double-chip OBU) as an example, sample data are shown in Fig. 2.

Fig. 2 An example of ETC transaction flow (double-chip OBU) data

Data characteristics

The change in traffic flow at gantry G000237011000510040 over the 10 days affected by the holiday is shown in Fig. 3. The average daily traffic flow increases from September 29 and reaches its maximum on October 2. It is stable on October 3 and 4 and gradually decreases from October 5, which conforms to the traffic flow pattern of the National Day holiday.

Fig. 3 Gantry G000237011000510040 traffic changes per minute during holidays

4.2 Prediction Results

All models were trained in MATLAB R2020b. The evaluation indices for the prediction results are calculated as follows:

$$MAE = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {\hat{v}_{i} - v_{i} } \right|$$
(10)
$$MAPE = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \frac{{\left| {\hat{v}_{i} - v_{i} } \right|}}{{v_{i} }} \times 100\%$$
(11)
$$RMSE = \sqrt {\mathop \sum \limits_{i = 1}^{n} \frac{{\left( {\hat{v}_{i} - v_{i} } \right)^{2} }}{n}}$$
(12)

In these formulas, \(\hat{v}_{i}\) is the predicted value of traffic flow at time i, \(v_{i}\) is the observed value of traffic flow at time i, and n is the number of predicted values.
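The three evaluation indices in Eqs. (10) to (12) can be computed with a few lines of Python. This is a minimal sketch with hypothetical function names and made-up values; the models in this study were trained in MATLAB, so it is not the code used to produce the reported results.

```python
import math


def mae(pred, obs):
    """Eq. (10): mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def mape(pred, obs):
    """Eq. (11): mean absolute percentage error, in percent.

    Undefined when an observed value is zero; such samples must be
    filtered out by the caller.
    """
    return 100.0 * sum(abs(p - o) / o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Eq. (12): root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Made-up observed and predicted flows, for illustration only.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = (mae(predicted, observed),
          mape(predicted, observed),
          rmse(predicted, observed))
```

For these values the MAE is 20/3 and the MAPE is 5%. For any series, RMSE is never smaller than MAE, which gives a quick sanity check when both are reported.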

Figure 4 compares the holiday traffic flow predictions for the three gantries obtained with the BP, ELMAN and RBF algorithms. The figure shows that, for all three gantries, the RBF neural network algorithm gives better predictions than the other two methods, and that the prediction errors differ because of the differing complexity of the sections to which the three gantries belong. The prediction results for gantry G000237011000510040 are significantly better than those for the other two gantries.

Fig. 4 Comparison of holiday flow forecast results of BP, ELMAN and RBF

Table 1 summarizes the average traffic prediction errors of the three algorithms. The RBF neural network algorithm gives the best prediction results of the three, followed by the BP and ELMAN neural network algorithms. For the algorithm proposed in this study, the MAE of holiday traffic is within 75 veh/min for G000237011000320070, within 55 veh/min for G000237011000420080 and within 32 veh/min for G000237011000510040. The RMSE of all three gantries is within 6 veh/min, and the MAPE is below 4.5% in every case.

Table 1 Holiday traffic prediction error

5 Conclusion

Considering that ETC gantry data contain a great deal of information, this study proposes an RBF neural network algorithm that uses the historical traffic flow trend to predict the flow trend of gantry sections.

(1) The RBF neural network algorithm predicts the changing trend of gantry section flow on holidays from the historical trend, which increases the robustness of the model's predictions.

(2) The RBF neural network algorithm was applied to holiday flow prediction for gantry sections: three gantries on the Beijing-Shanghai Expressway in Shandong Province were selected, and the BP, ELMAN and RBF neural network algorithms were compared, which shows the superiority of the algorithm proposed in this study. The MAE was less than 75 veh/min, the RMSE less than 6 veh/min and the MAPE less than 4.5%.